Title Of Invention

Cloned Genomes Of Infectious 
Hepatitis C Viruses And Uses Thereof 

This application claims the benefit of U.S. 
Provisional Application No. 60/053,062 filed July 18, 
1997. 

Field Of Invention 

The present invention relates to molecular 
approaches to the production of nucleic acid sequences 
which comprise the genome of infectious hepatitis C 
viruses. In particular, the invention provides nucleic 
acid sequences which comprise the genomes of infectious 
hepatitis C viruses of genotype 1a and 1b strains. The
invention therefore relates to the use of these sequences, 
and polypeptides encoded by all or part of these 
sequences, in the development of vaccines and diagnostic 
assays for HCV and in the development of screening assays 
for the identification of antiviral agents for HCV. 

Background Of Invention

Hepatitis C virus (HCV) has a positive-sense
single-strand RNA genome and is a member of the virus
family Flaviviridae (Choo et al., 1991; Rice, 1996). As 
for all positive-stranded RNA viruses, the genome of HCV 
functions as mRNA from which all viral proteins necessary 
for propagation are translated. 

The viral genome of HCV is approximately 9600
nucleotides (nts) and consists of a highly conserved 5'
untranslated region (UTR), a single long open reading
frame (ORF) of approximately 9,000 nts and a complex 3'
UTR. The 5' UTR contains an internal ribosomal entry site
(Tsukiyama-Kohara et al., 1992; Honda et al., 1996). The
3' UTR consists of a short variable region, a
polypyrimidine tract of variable length and, at the 3'
end, a highly conserved region of approximately 100 nts
(Kolykhalov et al., 1996; Tanaka et al., 1995; Tanaka et
al., 1996; Yamada et al., 1996). The last 46 nucleotides
of this conserved region were predicted to form a stable
stem-loop structure thought to be critical for viral
replication (Blight and Rice, 1997; Ito and Lai, 1997;
Tsuchihara et al., 1997). The ORF encodes a large
polypeptide precursor that is cleaved into at least 10
proteins by host and viral proteinases (Rice, 1996). The
predicted envelope proteins contain several conserved
N-linked glycosylation sites and cysteine residues (Okamoto
et al., 1992a). The NS3 gene encodes a serine protease
and an RNA helicase and the NS5B gene encodes an
RNA-dependent RNA polymerase.

Globally, six major HCV genotypes (genotypes
1-6) and multiple subtypes (a, b, c, etc.) have been
identified (Bukh et al., 1993; Simmonds et al., 1993).
The most divergent HCV isolates differ from each other by
more than 30% over the entire genome (Okamoto et al.,
1992a) and HCV circulates in an infected individual as a
quasispecies of closely related genomes (Bukh et al.,
1995; Farci et al., 1997).

At present, more than 80% of individuals
infected with HCV become chronically infected and these
chronically infected individuals have a relatively high
risk of developing chronic hepatitis, liver cirrhosis and
hepatocellular carcinoma (Hoofnagle, 1997). In the U.S.,
HCV genotypes 1a and 1b constitute the majority of
infections while in many other areas, especially in Europe
and Japan, genotype 1b predominates.

The only effective therapy for chronic hepatitis
C, interferon (IFN), induces a sustained response in less
than 25% of treated patients (Fried and Hoofnagle, 1995).
Consequently, HCV is currently the most common cause of
end stage liver failure and the reason for about 30% of
liver transplants performed in the U.S. (Hoofnagle, 1997).
In addition, a number of recent studies suggested that the
severity of liver disease and the outcome of therapy may
be genotype-dependent (reviewed in Bukh et al., 1997). In
particular, these studies suggested that infection with
HCV genotype 1b was associated with more severe liver
disease (Brechot, 1997) and a poorer response to IFN
therapy (Fried and Hoofnagle, 1995). As a result of the
inability to develop a universally effective therapy
against HCV infection, it is estimated that there are
still more than 25,000 new infections yearly in the U.S.
(Alter, 1997). Moreover, since there is no vaccine for HCV,
HCV remains a serious public health problem.

However, despite the intense interest in the
development of vaccines and therapies for HCV, progress
has been hindered by the absence of a useful cell culture
system and the lack of any small animal model for
laboratory study. For example, while replication of HCV
in several cell lines has been reported, such observations
have turned out not to be highly reproducible. In
addition, the chimpanzee is the only animal model, other
than man, for this disease. Consequently, HCV could be
studied only by using clinical materials
obtained from patients or experimentally infected
chimpanzees (an animal model whose availability is very
limited).

However, several researchers have recently
reported the construction of infectious cDNA clones of
HCV, the identification of which would permit a more
effective search for susceptible cell lines and facilitate
molecular analysis of the viral genes and their function.
For example, Dash et al. (1997) and Yoo et al. (1995)
reported that RNA transcripts from cDNA clones of HCV-1
(genotype 1a) and HCV-N (genotype 1b), respectively,
resulted in viral replication after transfection into
human hepatoma cell lines. Unfortunately, the viability
of these clones was not tested in vivo and concerns were
raised about the infectivity of these cDNA clones in vitro
(Fausto, 1997). In addition, neither clone contained
the terminal 98 conserved nucleotides at the very 3' end
of the UTR.

Kolykhalov et al. (1997) and Yanagi et al.
(1997) reported the derivation from HCV strain H77 (which
is genotype 1a) of cDNA clones of HCV that are infectious
for chimpanzees. However, while these infectious clones
will aid in studying HCV replication and pathogenesis and
will provide an important tool for development of in vitro
replication and propagation systems, it is important to
have infectious clones of more than one genotype given the
extensive genetic heterogeneity of HCV and the potential
impact of such heterogeneity on the development of
effective therapies and vaccines for HCV.

Summary Of The Invention 

The present invention relates to nucleic acid
sequences which comprise the genome of infectious
hepatitis C viruses and in particular, nucleic acid
sequences which comprise the genome of infectious
hepatitis C viruses of genotype 1a and 1b strains. It is
therefore an object of the invention to provide nucleic
acid sequences which encode infectious hepatitis C
viruses. Such nucleic acid sequences are referred to
throughout the application as "infectious nucleic acid
sequences".

For the purposes of this application, nucleic
acid sequence refers to RNA, DNA, cDNA or any variant
thereof capable of directing host organism synthesis of a
hepatitis C virus polypeptide. It is understood that
nucleic acid sequence encompasses nucleic acid sequences
which, due to degeneracy, encode the same polypeptide
sequence as the nucleic acid sequences described herein.
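
As an illustration of degeneracy only, the following minimal Python sketch translates two different toy nucleotide sequences with the standard genetic code and shows that they encode the same polypeptide. The function name `translate` and both toy sequences are assumptions made for this example and are not taken from the sequences of the invention.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate an RNA or DNA coding sequence into one-letter amino acids."""
    dna = seq.upper().replace("U", "T")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical toy sequences: they differ at four nucleotide positions
# but both encode Met-Ser-Thr.
seq_a = "ATGAGCACC"
seq_b = "ATGTCTACG"
same_polypeptide = translate(seq_a) == translate(seq_b)
```

Because serine and threonine are each specified by several codons, the two sequences are distinct at the nucleotide level yet equivalent at the polypeptide level, which is the sense in which the definition above treats degenerate variants as encompassed.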

The invention also relates to the use of the
infectious nucleic acid sequences to produce chimeric
genomes consisting of portions of the open reading frames
of infectious nucleic acid sequences of other genotypes
(including, but not limited to, genotypes 1, 2, 3, 4, 5
and 6) and subtypes (including, but not limited to,
subtypes 1a, 1b, 2a, 2b, 2c, 3a, 4a-4f, 5a and 6a) of HCV.
For example, infectious nucleic acid sequences of the 1a and
1b strains H77 and HC-J4, respectively, described herein
can be used to produce chimeras with sequences from the
genomes of other strains of HCV from different genotypes
or subtypes. Nucleic acid sequences which comprise
sequence from the open-reading frames of 2 or more HCV
genotypes or subtypes are designated "chimeric nucleic
acid sequences".

The invention further relates to mutations of
the infectious nucleic acid sequences of the invention
where mutation includes, but is not limited to, point
mutations, deletions and insertions. In one embodiment, a
gene or fragment thereof can be deleted to determine the
effect of the deleted gene or genes on the properties of
the encoded virus such as its virulence and its ability to
replicate. In an alternative embodiment, a mutation may
be introduced into the infectious nucleic acid sequences
to examine the effect of the mutation on the properties of
the virus in the host cell.

The invention also relates to the introduction
of mutations or deletions into the infectious nucleic acid
sequences in order to produce an attenuated hepatitis C
virus suitable for vaccine development.

The invention further relates to the use of the
infectious nucleic acid sequences to produce attenuated
viruses via passage in vitro or in vivo of the viruses
produced by transfection of a host cell with the
infectious nucleic acid sequence.

The present invention also relates to the use of
the nucleic acid sequences of the invention or fragments
thereof in the production of polypeptides where "nucleic
acid sequences of the invention" refers to infectious
nucleic acid sequences, mutations of infectious nucleic
acid sequences, chimeric nucleic acid sequences and
sequences which comprise the genome of attenuated viruses
produced from the infectious nucleic acid sequences of the
invention. The polypeptides of the invention, especially
structural polypeptides, can serve as immunogens in the
development of vaccines or as antigens in the development
of diagnostic assays for detecting the presence of HCV in
biological samples.

The invention therefore also relates to vaccines
for use in immunizing mammals, especially humans, against
hepatitis C. In one embodiment, the vaccine comprises one
or more polypeptides made from a nucleic acid sequence of
the invention or fragment thereof. In a second
embodiment, the vaccine comprises a hepatitis C virus
produced by transfection of host cells with the nucleic
acid sequences of the invention.

The present invention therefore relates to
methods for preventing hepatitis C in a mammal. In one
embodiment the method comprises administering to a mammal
a polypeptide or polypeptides encoded by a nucleic acid
sequence of the invention in an amount effective to induce
protective immunity to hepatitis C. In another
embodiment, the method of prevention comprises
administering to a mammal a hepatitis C virus of the
invention in an amount effective to induce protective
immunity against hepatitis C.

In yet another embodiment, the method of 
protection comprises administering to a mammal a nucleic 
acid sequence of the invention or a fragment thereof in an 
amount effective to induce protective immunity against
hepatitis C. 

The invention also relates to hepatitis C 
viruses produced by host cells transfected with the 
nucleic acid sequences of the present invention.

The invention therefore also provides
pharmaceutical compositions comprising the nucleic acid
sequences of the invention and/or their encoded hepatitis
C viruses. The invention further provides pharmaceutical
compositions comprising polypeptides encoded by the
nucleic acid sequences of the invention or fragments
thereof. The pharmaceutical compositions of the invention
may be used prophylactically or therapeutically.

The invention also relates to antibodies to the
hepatitis C viruses of the invention or their encoded
polypeptides and to pharmaceutical compositions comprising
these antibodies.

The present invention further relates to
polypeptides encoded by the nucleic acid sequences of the
invention or fragments thereof. In one embodiment, said
polypeptide or polypeptides are fully or partially
purified from hepatitis C virus produced by cells
transfected with nucleic acid sequence of the invention.
In another embodiment, the polypeptide or polypeptides are
produced recombinantly from a fragment of the nucleic acid
sequences of the invention. In yet another embodiment,
the polypeptides are chemically synthesized.

The invention also relates to the use of the 
nucleic acid sequences of the invention to identify cell 
lines capable of supporting the replication of HCV in
vitro.

The invention further relates to the use of the 
nucleic acid sequences of the invention or their encoded 
proteases (e.g. NS3 protease) to develop screening assays
to identify antiviral agents for HCV. 

Brief Description Of Figures 

Figure 1 shows a strategy for constructing
full-length cDNA clones of HCV strain H77. The long PCR
products amplified with H1 and H9417R primers were cloned
directly into pGEM-9zf(-) after digestion with NotI and
XbaI (pH21j and pH50j). Next, the 3' UTR was cloned into
both pH21j and pH50j after digestion with AflII and XbaI
(pH21 and pH50). pH21 was tested for infectivity in a
chimpanzee. To improve the efficiency of cloning, we
constructed a cassette vector with consensus 5' and 3'
termini of H77. This cassette vector (pCV) was obtained
by cutting out the BamHI fragment (nts 1358-7530 of the
H77 genome) from pH50, followed by religation. Finally,
the long PCR products of H77 amplified with primers H1 and
H9417R (H product) or primers A1 and H9417R (A product)
were cloned into pCV after digestion with AgeI and AflII
or with PinAI and BfrI. The latter procedure yielded
multiple complete cDNA clones of strain H77 of HCV.

Figure 2 shows the results of gel
electrophoresis of long RT-PCR amplicons of the entire ORF
of H77 and the transcription mixture of the infectious
clone of H77. The complete ORF was amplified by long
RT-PCR with the primers H1 or A1 and H9417R from 10^5 GE of
H77. A total of 10 µg of the consensus chimeric clone
(pCV-H77C) linearized with XbaI was transcribed in a 100
µl reaction with T7 RNA polymerase. Five µl of the
transcription mixture was analyzed by gel electrophoresis
and the remainder of the mixture was injected into a
chimpanzee. Lane 1, molecular weight marker; lane 2,
products amplified with primers H1 and H9417R; lane 3,
products amplified with primers A1 and H9417R; lane 4,
transcription mixture containing the RNA transcripts and
linearized clone pCV-H77C (12.5 kb).

Figure 3 is a diagram of the genome organization
of HCV strain H77 and the genetic heterogeneity of
individual full-length clones compared with the consensus
sequence of H77. Solid lines represent aa changes.
Dashed lines represent silent mutations. A * in pH21
represents a point mutation at nt 58 in the 5' UTR. In
the ORF, the consensus chimeric clone pCV-H77C had 11 nt
differences [at positions 1625 (C→T), 2709 (T→C), 3380
(A→G), 3710 (C→T), 3914 (G→A), 4463 (T→C), 5058 (C→T),
5834 (C→T), 6734 (T→C), 7154 (C→T), and 7202 (T→C)] and
one aa change (F→L at aa 790) compared with the
consensus sequence of H77. This clone was infectious.
Clones pH21 and pCV-H11 had 19 nt (7 aa) and 64 nt (21
aa) differences, respectively, compared with the consensus
sequence of H77. These two clones were not infectious. A
single point mutation in the 3' UTR at nucleotide 9406
(G→A) introduced to create an AflII cleavage site is not
shown.

Figures 4A-4F show the complete nucleotide
sequence of an H77C clone produced according to the present
invention and Figures 4G-4H show the amino acid sequence
encoded by the H77C clone.

Figure 5 shows an agarose gel of long RT-PCR
amplicons and transcription mixtures. Lanes 1 and 4:
Molecular weight marker (Lambda/HindIII digest). Lanes 2
and 3: RT-PCR amplicons of the entire ORF of HC-J4. Lane
5: pCV-H77C transcription control (Yanagi et al., 1997).
Lanes 6, 7, and 8: 1/40 of each transcription mixture of
pCV-J4L2S, pCV-J4L4S and pCV-J4L6S, respectively, which
was injected into the chimpanzee.

Figure 6 shows the strategy utilized for the
construction of full-length cDNA clones of HCV strain
HC-J4. The long PCR products were cloned as two separate
fragments (L and S) into a cassette vector (pCV) with
fixed 5' and 3' termini of HCV (Yanagi et al., 1997).
Full-length cDNA clones of HC-J4 were obtained by
inserting the L fragment from three pCV-J4L clones into
three identical pCV-J4S9 clones after digestion with
PinAI (isoschizomer of AgeI) and BfrI (isoschizomer of
AflII).
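
The role of the isoschizomer pairs can be illustrated with a minimal Python sketch that locates recognition sites in a sequence. The recognition sequences used (ACCGGT for AgeI/PinAI, CTTAAG for AflII/BfrI) are the standard published ones; the function name `find_sites` and the toy sequence are hypothetical and do not come from pCV or HC-J4.

```python
# Recognition sequences; isoschizomers share the same site.
RECOGNITION_SITES = {
    "AgeI": "ACCGGT",
    "PinAI": "ACCGGT",
    "AflII": "CTTAAG",
    "BfrI": "CTTAAG",
}


def find_sites(seq: str, enzyme: str) -> list[int]:
    """Return 0-based start positions of every recognition site for enzyme."""
    site = RECOGNITION_SITES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# Hypothetical toy insert flanked by one PinAI site and one BfrI site.
toy = "GGACCGGTAAACTTAAGCC"
pinai_cuts = find_sites(toy, "PinAI")
bfri_cuts = find_sites(toy, "BfrI")
```

Both recognition sequences are palindromic, so scanning a single strand is sufficient in this sketch. A fragment bounded by one site of each kind can be exchanged between clones cut with either member of each isoschizomer pair.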

Figure 7 shows amino acid positions with a
quasispecies of HC-J4 in the acute phase plasma pool
obtained from an experimentally infected chimpanzee.
Cons-p9: consensus amino acid sequence deduced from
analysis of nine L fragments and nine S fragments (see
Fig. 6). Cons-D: consensus sequence derived from direct
sequencing of the PCR product. A, B, C: groups of similar
viral species. Dot: amino acid identical to that in
Cons-p9. Capital letter: amino acid different from that in
Cons-p9. Cons-F: composite consensus amino acid sequence
combining Cons-p9 and Cons-D. Boxed amino acid: different
from that in Cons-F. Shaded amino acid: different from
that in all species A sequences. An *: defective ORF due
to a nucleotide deletion (clone L1, aa 1097) or insertion
(clone L7, aa 2770). Diagonal lines: fragments used to
construct the infectious clone.

Figure 8 shows comparisons (percent difference)
of nucleotide (nts 156-8935) and predicted amino acid
sequences (aa 1-2864) of L clones (species A, B, and C,
this study), HC-J4/91 (Okamoto et al., 1992b) and HC-J4/83
(Okamoto et al., 1992b). Differences among species A
sequences and among species B sequences are shaded.
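
A percent difference of the kind tabulated in Figure 8 can be computed, in its simplest form, as the fraction of mismatched positions between two aligned sequences of equal length. The sketch below is an assumption about the calculation, since the text does not specify how gaps or ambiguous positions were handled; the function name and the toy sequences are hypothetical.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * mismatches / len(seq_a)


# Hypothetical aligned fragments differing at 1 of 8 positions.
diff = percent_difference("ACGTACGT", "ACGTACGA")
```

The same function applies to nucleotide and amino acid strings, so a pairwise table could be filled by calling it once per pair of clones.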

Figure 9 shows UPGMA ("unweighted pair group
method with arithmetic mean") trees of HC-J4/91 (Okamoto
et al., 1992b), HC-J4/83 (Okamoto et al., 1992b), two
prototype strains of genotype 1b (HCV-J, Kato et al.,
1990; HCV-BK, Takamizawa et al., 1991), and L clones (this
study).

Figure 10 shows the alignment of the HVR1 and
HVR2 amino acid sequences of the E2 sequences of nine L
clones of HC-J4 (species A, B, and C) obtained from an
early acute phase plasma pool of an experimentally
infected chimpanzee compared with the sequences of eight
clones (HC-J4/91-20 through HC-J4/91-27, Okamoto et al.,
1992b) derived from the inoculum. Dot: an amino acid
identical to that in the top line. Capital letters: amino
acid different from that in the top line.

Figure 11 shows the alignment of the 5' UTR and
the 3' UTR sequences of infectious clones of genotype 1a
(pCV-H77C) and 1b (pCV-J4L6S). Top line: consensus
sequence of the indicated strain. Dot: identity with
consensus sequence. Capital letter: different from the
consensus sequence. Dash: deletion. Underlined: PinAI
and BfrI cleavage site. Numbering corresponds to the HCV
sequence of pCV-J4L6S.

Figure 12 shows a comparison of individual
full-length cDNA clones of the ORF of HCV strain HC-J4 with
the consensus sequence (see Fig. 7). Solid lines: amino
acid changes. Dashed lines: silent mutations. Clone
pCV-J4L6S was infectious in vivo whereas clones pCV-J4L2S and
pCV-J4L4S were not.

Figure 13 shows biochemical (ALT levels) and PCR 
analyses of a chimpanzee following percutaneous 
intrahepatic transfection with RNA transcripts of the 
infectious clone of pCV-J4L2S, pCV-J4L4S and pCV-J4L6S. 
The ALT serum enzyme levels were measured in units per 
liter (u/l). For the PCR analysis, "HCV RNA" represented
by an open rectangle indicates a serum sample that was 
negative for HCV after nested PCR; "HCV RNA" represented 
by a closed rectangle indicates that the serum sample was 
positive for HCV and HCV GE titer on the right-hand y-axis 
represents genome equivalents. 

Figures 14A-14F show the nucleotide sequence of
the infectious clone of genotype 1b strain HC-J4 and
Figures 14G-14H show the amino acid sequence encoded by
the HC-J4 clone.

Figure 15 shows the strategy for constructing a
chimeric HCV clone designated pH77CV-J4 which contains the
nonstructural region of the infectious clone of genotype
1a strain H77 and the structural region of the infectious
clone of genotype 1b strain HC-J4.

Figures 16A-16F show the nucleotide sequence of
the chimeric 1a/1b clone pH77CV-J4 of Figure 15 and
Figures 16G-16H show the amino acid sequence encoded by
the chimeric 1a/1b clone.

Figures 17A and 17B show the sequence of the 3'
untranslated region remaining in various 3' deletion
mutants of the 1a infectious clone pCV-H77C and the
strategy utilized in constructing each 3' deletion mutant
(Figures 17C-17G).

Of the seven deletion mutants shown, two
(pCV-H77C(-98X) and (-42X)) have been constructed and tested
for infectivity in chimpanzees (see Figures 17A and 17C)
and the other six are to be constructed and tested for
infectivity as described in Figures 17D-17G.

Figures 18A and 18B show biochemical (ALT
levels), PCR (HCV RNA and HCV GE titer), serological
(anti-HCV) and histopathological (Fig. 18B only) analyses
of chimpanzees 1494 (Fig. 18A) and 1530 (Fig. 18B)
following transfection with the infectious cDNA clone
pCV-H77C.

The ALT serum enzyme levels were measured in
units per liter (u/l). For the PCR analysis, "HCV RNA"
represented by an open rectangle indicates a serum sample
that was negative for HCV after nested PCR; "HCV RNA"
represented by a closed rectangle indicates that the serum
sample was positive for HCV; and HCV GE titer on the
right-hand y-axis represents genome equivalents.

The bar marked "anti-HCV" indicates samples that
were positive for anti-HCV antibodies as determined by
commercial assays. The histopathology scores in Figure
18B correspond to no histopathology (○), mild hepatitis
(◐) and moderate to severe hepatitis (●).

DESCRIPTION OF THE INVENTION 

The present invention relates to nucleic acid
sequences which comprise the genome of an infectious
hepatitis C virus. More specifically, the invention
relates to nucleic acid sequences which encode infectious
hepatitis C viruses of genotype 1a and 1b strains. In one
embodiment, the infectious nucleic acid sequence of the
invention has the sequence shown in Figures 4A-4F of this
application. In another embodiment, the infectious
nucleic acid sequence has the sequence shown in Figures
14A-14F and is contained in a plasmid construct deposited
with the American Type Culture Collection (ATCC) on
January 26, 1998 and having ATCC accession number 209596.

The invention also relates to "chimeric nucleic
acid sequences" where the chimeric nucleic acid sequences
consist of open-reading frame sequences taken from
infectious nucleic acid sequences of hepatitis C viruses
of different genotypes or subtypes.

In one embodiment, the chimeric nucleic acid
sequence consists of sequence from the genome of an HCV
strain belonging to one genotype or subtype which encodes
structural polypeptides and sequence of an HCV strain
belonging to another genotype or subtype which
encodes nonstructural polypeptides. Such chimeras can be
produced by standard techniques of restriction digestion,
PCR amplification and subcloning known to those of
ordinary skill in the art.

In a preferred embodiment, the sequence encoding
nonstructural polypeptides is from an infectious nucleic
acid sequence encoding a genotype 1a strain where the
construction of a chimeric 1a/1b nucleic acid sequence is
described in Example 9 and the chimeric 1a/1b nucleic acid
sequence is shown in Figures 16A-16F. It is believed that
the construction of such chimeric nucleic acid sequences
will be of importance in studying the growth and virulence
properties of hepatitis C virus and in the production of
hepatitis C viruses suitable to confer protection against
multiple genotypes of HCV. For example, one might produce
a "multivalent" vaccine by putting epitopes from several
genotypes or subtypes into one clone. Alternatively, one
might replace just a single gene from an infectious
sequence with the corresponding gene from the genomic
sequence of a strain from another genotype or subtype or
create a chimeric gene which contains portions of a gene
from two genotypes or subtypes. Examples of genes which
could be replaced or which could be made chimeric
include, but are not limited to, the E1, E2 and NS4 genes.

WO 99/04008 PCT/US98/14688

The invention further relates to mutations of
the infectious nucleic acid sequences where "mutations"
includes, but is not limited to, point mutations,
deletions and insertions. Of course, one of ordinary
skill in the art would recognize that the size of the
insertions would be limited by the ability of the
resultant nucleic acid sequence to be properly packaged
within the virion. Such mutations could be produced by
techniques known to those of skill in the art such as
site-directed mutagenesis, fusion PCR, and restriction
digestion followed by religation.

In one embodiment, mutagenesis might be
undertaken to determine sequences that are important for
viral properties such as replication or virulence. For
example, one may introduce a mutation into the infectious
nucleic acid sequence which eliminates the cleavage site
between the NS4A and NS4B polypeptides to examine the
effects on viral replication and processing of the
polypeptide. Alternatively, one or more of the 3 amino
acids encoded by the infectious 1b nucleic acid sequence
shown in Figures 14A-14F which differ from the HC-J4
consensus sequence may be back mutated to the
corresponding amino acid in the HC-J4 consensus sequence
to determine the importance of these three amino acid
changes to infectivity or virulence. In yet another
embodiment, one or more of the amino acids from the
noninfectious 1b clones pCV-J4L2S and pCV-J4L4S which
differ from the consensus sequence may be introduced into
the infectious 1b sequence shown in Figures 14A-14F.

In yet another example, one may delete all or
part of a gene or of the 5' or 3' nontranslated region
contained in an infectious nucleic acid sequence and then
transfect a host cell (animal or cell culture) with the
mutated sequence and measure viral replication in the host
by methods known in the art such as RT-PCR. Preferred
genes include, but are not limited to, the P7, NS4B and
NS5A genes. Of course, those of ordinary skill in the art
will understand that deletion of part of a gene,
preferably the central portion of the gene, may be
preferable to deletion of the entire gene in order to
conserve the cleavage site boundaries which exist between
proteins in the HCV polyprotein and which are necessary
for proper processing of the polyprotein.

In the alternative, if the transfection is into
a host animal such as a chimpanzee, one can monitor the
virulence phenotype of the virus produced by transfection
of the mutated infectious nucleic acid sequence by methods
known in the art such as measurement of liver enzyme
levels (alanine aminotransferase (ALT) or isocitrate
dehydrogenase (ICD)) or by histopathology of liver
biopsies. Thus, mutations of the infectious nucleic acid
sequences may be useful in the production of attenuated
HCV strains suitable for vaccine use.

The invention also relates to the use of the 
infectious nucleic acid sequences of the present invention 
to produce attenuated viral strains via passage in vitro 
or in vivo of the virus produced by transfection with the
infectious nucleic acid sequences. 

The present invention therefore relates to the 
use of the nucleic acid sequences of the invention to 
identify cell lines capable of supporting the replication 
of HCV.

In particular, it is contemplated that the 
mutations of the infectious nucleic acid sequences of the 
invention and the production of chimeric sequences as 
discussed above may be useful in identifying sequences
critical for cell culture adaptation of HCV and hence, may 
be useful in identifying cell lines capable of supporting 
HCV replication. 

Transfection of tissue culture cells with the 
nucleic acid sequences of the invention may be done by 
methods of transfection known in the art such as 
electroporation, precipitation with DEAE-Dextran or 
calcium phosphate or liposomes.

In one such embodiment, the method comprises the
growing of animal cells, especially human cells, in vitro
and transfecting the cells with the nucleic acid of the
invention, then determining if the cells show indicia of
HCV infection. Such indicia include the detection of
viral antigens in the cell, for example, by
immunofluorescent procedures well known in the art; the
detection of viral polypeptides by Western blotting using
antibodies specific therefor; and the detection of newly
transcribed viral RNA within the cells via methods such as
RT-PCR. The presence of live, infectious virus particles
following such tests may also be shown by injection of
cell culture medium or cell lysates into healthy,
susceptible animals, with subsequent exhibition of the
symptoms of HCV infection.

Suitable cells or cell lines for culturing HCV
include, but are not limited to, lymphocyte and hepatocyte
cell lines known in the art.

Alternatively, primary hepatocytes can be
cultured, and then infected with HCV; or, the hepatocyte
cultures could be derived from the livers of infected
chimpanzees. In addition, various immortalization methods
known to those of ordinary skill in the art can be used to
obtain cell-lines derived from hepatocyte cultures. For
example, primary hepatocyte cultures may be fused to a
variety of cells to maintain stability.

The present invention further relates to the in 
vitro and in vivo production of hepatitis C viruses from 
the nucleic acid sequences of the invention. 

In one embodiment, the sequences of the
invention can be inserted into an expression vector that
functions in eukaryotic cells. Eukaryotic expression
vectors suitable for producing high efficiency gene
transfer in vivo are well known to those of ordinary skill
in the art and include, but are not limited to, plasmids,
vaccinia viruses, retroviruses, adenoviruses and
adeno-associated viruses.

In another embodiment, the sequences contained 

20 in the recombinant expression vector can be transcribed in 
vitro by methods known to those of ordinary skill in the 
art in order to produce RNA transcripts which encode the 
hepatitis C viruses of the invention. The hepatitis C 
viruses of the invention may then be produced by 
transfecting cells by methods known to those of ordinary 
skill in the art with either the in vitro transcription 
mixture containing the RNA transcripts (see Example 4) or 
with the recombinant expression vectors containing the 
nucleic acid sequences described herein. 

The present invention also relates to the 
construction of cassette vectors useful in the cloning of 
viral genomes wherein said vectors comprise a nucleic acid 
sequence to be cloned, and said vector reading in the 
correct phase for the expression of the viral nucleic acid 
to be cloned. Such a cassette vector will, of course, 
also possess a promoter sequence, advantageously placed 
upstream of the sequence to be expressed. Cassette 
vectors according to the present invention are constructed 
according to the procedure described in Figure 1, for 
example, starting with plasmid pCV. Of course, the DNA to 
be inserted into said cassette vector can be derived from 
any virus, advantageously from HCV, and most 
advantageously from the H77 strain of HCV. The nucleic 
acid to be inserted according to the present invention 
can, for example, contain one or more open reading frames 
of the virus, for example, HCV. The cassette vectors of 
the present invention may also contain, optionally, one or 
more expressible marker genes for expression as an 
indication of successful transfection and expression of 
the nucleic acid sequences of the vector. To ensure 
expression, the cassette vectors of the present invention 
will contain a promoter sequence for binding of the 
appropriate cellular RNA polymerase, which will depend on 
the cell into which the vector has been introduced. For 
example, if the host cell is a bacterial cell, then said 
promoter will be a bacterial promoter sequence to which 
the bacterial RNA polymerases will bind. 

The hepatitis C viruses produced from the 
sequences of the invention may be purified or partially 
purified from the transfected cells by methods known to 
those of ordinary skill in the art. In a preferred 
embodiment, the viruses are partially purified prior to 
their use as immunogens in the pharmaceutical compositions 
and vaccines of the present invention. 

The present invention therefore relates to the 
use of the hepatitis C viruses produced from the nucleic 
acid sequences of the invention as immunogens in live or 
killed (e.g., formalin inactivated) vaccines to prevent 
hepatitis C in a mammal. 

In an alternative embodiment, the immunogen of 
the present invention may be an infectious nucleic acid 
sequence, a chimeric nucleic acid sequence, or a mutated 
infectious nucleic acid sequence which encodes a hepatitis 
C virus. Where the sequence is a cDNA sequence, the cDNAs 
and their RNA transcripts may be used to transfect a 
mammal by direct injection into the liver tissue of the 
mammal as described in the Examples. 

Alternatively, direct gene transfer may be 
accomplished via administration of a eukaryotic expression 
vector containing a nucleic acid sequence of the 
invention. 

In yet another embodiment, the immunogen may be 
a polypeptide encoded by the nucleic acid sequences of the 
invention. The present invention therefore also relates 
to polypeptides produced from the nucleic acid sequences 
of the invention or fragments thereof. In one embodiment, 
polypeptides of the present invention can be recombinantly 
produced by synthesis from the nucleic acid sequences of 
the invention or isolated fragments thereof, and purified, 
or partially purified, from transfected cells using 
methods already known in the art. In an alternative 
embodiment, the polypeptides may be purified or partially 
purified from viral particles produced via transfection of 
a host cell with the nucleic acid sequences of the 
invention. Such polypeptides might, for example, include 
either capsid or envelope polypeptides prepared from the 
sequences of the present invention. 

When used as immunogens, the nucleic acid 
sequences of the invention, or the polypeptides or viruses 
produced therefrom, are preferably partially purified 
prior to use as immunogens in pharmaceutical compositions 
and vaccines of the present invention. When used as a 
vaccine, the sequences and the polypeptide and virus 
products thereof, can be administered alone or in a 
suitable diluent, including, but not limited to, water, 
saline, or some type of buffered medium. The vaccine 
according to the present invention may be administered to 
an animal, especially a mammal, and most especially a 
human, by a variety of routes, including, but not limited 
to, intradermally, intramuscularly, subcutaneously, or in 
any combination thereof. 

Suitable amounts of material to administer for 
prophylactic and therapeutic purposes will vary depending 
on the route selected and the immunogen (nucleic acid, 
virus, polypeptide) administered. One skilled in the art 
will appreciate that the amounts to be administered for 
any particular treatment protocol can be readily 
determined without undue experimentation. The vaccines of 
the present invention may be administered once or 
periodically until a suitable titer of anti-HCV antibodies 
appears in the blood. For an immunogen consisting of a 
nucleic acid sequence, a suitable amount of nucleic acid 
sequence to be used for prophylactic purposes might be 
expected to fall in the range of from about 100 μg to 
about 5 mg and most preferably in the range of from about 
500 μg to about 2 mg. For a polypeptide, a suitable amount 
to use for prophylactic purposes is preferably 100 ng to 
100 μg and for a virus 10^2 to 10^6 infectious doses. Such 
administration will, of course, occur prior to any sign of 
HCV infection. 

A vaccine of the present invention may be 
employed in such forms as capsules, liquid solutions, 
suspensions or elixirs for oral administration, or sterile 
liquid forms such as solutions or suspensions. Any inert 
carrier is preferably used, such as saline or 
phosphate-buffered saline, or any such carrier in which the 
HCV of the present invention can be suitably suspended. The 
vaccines may be in the form of single dose preparations or 
in multi-dose flasks which can be utilized for 
mass-vaccination programs of both animals and humans. For 
purposes of using the vaccines of the present invention 
reference is made to Remington's Pharmaceutical Sciences, 
Mack Publishing Co., Easton, Pa., Osol (Ed.) (1980); and 
New Trends and Developments in Vaccines, Voller et al. 
(Eds.), University Park Press, Baltimore, Md. (1978), both 
of which provide much useful information for preparing and 
using vaccines. Of course, the polypeptides of the 
present invention, when used as vaccines, can include, as 
part of the composition or emulsion, a suitable adjuvant, 
such as alum (or aluminum hydroxide) when humans are to be 
vaccinated, to further stimulate production of antibodies 
by immune cells. When nucleic acids or viruses are used 
for vaccination purposes, other specific adjuvants such as 
CpG motifs (Krieg, A.K. et al. (1995) and (1996)) may 
prove useful. 

When the nucleic acids, viruses and polypeptides 
of the present invention are used as vaccines or inocula, 
they will normally exist as physically discrete units 
suitable as a unitary dosage for animals, especially 
mammals, and most especially humans, wherein each unit 
will contain a predetermined quantity of active material 
calculated to produce the desired immunogenic effect in 
association with the required diluent. The dose of said 
vaccine or inoculum according to the present invention is 
administered at least once. In order to increase the 
antibody level, a second or booster dose may be 
administered at some time after the initial dose. The 
need for, and timing of, such booster dose will, of 
course, be determined within the sound judgment of the 
administrator of such vaccine or inoculum and according to 
sound principles well known in the art. For example, such 
booster dose could reasonably be expected to be 
advantageous at some time between about 2 weeks and about 6 
months following the initial vaccination. Subsequent 
doses may be administered as indicated. 

The nucleic acid sequences, viruses and 
polypeptides of the present invention can also be 
administered for purposes of therapy, where a mammal, 
especially a primate, and most especially a human, is 
already infected, as shown by well known diagnostic 
measures. When the nucleic acid sequences, viruses or 
polypeptides of the present invention are used for such 
therapeutic purposes, many of the same criteria will apply 
as when they are used as a vaccine, except that inoculation 
will occur post-infection. Thus, when the nucleic acid 
sequences, viruses or polypeptides of the present 
invention are used as therapeutic agents in the treatment 
of infection, the therapeutic agent comprises a 
pharmaceutical composition containing a sufficient amount 
of said nucleic acid sequences, viruses or polypeptides so 
as to elicit a therapeutically effective response in the 
organism to be treated. Of course, the amount of 
pharmaceutical composition to be administered will, as for 
vaccines, vary depending on the immunogen contained 
therein (nucleic acid, polypeptide, virus) and on the 
route of administration. 

The therapeutic agent according to the present 
invention can thus be administered by subcutaneous, 
intramuscular or intradermal routes. One skilled in the 
art will certainly appreciate that the amounts to be 
administered for any particular treatment protocol can be 
readily determined without undue experimentation. Of 
course, the actual amounts will vary depending on the 
route of administration as well as the sex, age, and 
clinical status of the subject which, in the case of human 
patients, is to be determined with the sound judgment of 
the clinician. 

The therapeutic agent of the present invention 
can be employed in such forms as capsules, liquid 
solutions, suspensions or elixirs, or sterile liquid forms 
such as solutions or suspensions. Any inert carrier is 
preferably used, such as saline, phosphate-buffered 
saline, or any such carrier in which the HCV of the 
present invention can be suitably suspended. The 
therapeutic agents may be in the form of single dose 
preparations or in multi-dose flasks which can be 
utilized for mass-treatment programs of both animals and 
humans. Of course, when the nucleic acid sequences, 
viruses or polypeptides of the present invention are used 
as therapeutic agents they may be administered as a single 
dose or as a series of doses, depending on the situation 
as determined by the person conducting the treatment. 

The nucleic acids, polypeptides and viruses of 
the present invention can also be utilized in the 
production of antibodies against HCV. The term "antibody" 
is herein used to refer to immunoglobulin molecules and 
immunologically active portions of immunoglobulin 
molecules. Examples of antibody molecules are intact 
immunoglobulin molecules, substantially intact 
immunoglobulin molecules and portions of an immunoglobulin 
molecule, including those portions known in the art as 
Fab, F(ab')2 and F(v) as well as chimeric antibody 
molecules. 

Thus, the polypeptides, viruses and nucleic acid 
sequences of the present invention can be used in the 
generation of antibodies that immunoreact (i.e., specific 
binding between an antigenic determinant-containing 
molecule and a molecule containing an antibody combining 
site such as a whole antibody molecule or an active 
portion thereof) with antigenic determinants on the 
surface of hepatitis C virus particles. 

The present invention therefore also relates to 
antibodies produced following immunization with the 
nucleic acid sequences, viruses or polypeptides of the 
present invention. These antibodies are typically 
produced by immunizing a mammal with an immunogen or 
vaccine to induce antibody molecules having 
immunospecificity for polypeptides or viruses produced in 
response to infection with the nucleic acid sequences of 
the present invention. When used in generating such 
antibodies, the nucleic acid sequences, viruses, or 
polypeptides of the present invention may be linked to 
some type of carrier molecule. The resulting antibody 
molecules are then collected from said mammal. Antibodies 
produced according to the present invention have the 
unique advantage of being generated in response to 
authentic, functional polypeptides produced according to 
the actual cloned HCV genome. 

The antibody molecules of the present invention 
may be polyclonal or monoclonal. Monoclonal antibodies 
are readily produced by methods well known in the art. 
Portions of immunoglobulin molecules, such as Fabs, as well 
as chimeric antibodies, may also be produced by methods 
well known to those of ordinary skill in the art of 
generating such antibodies. 

The antibodies according to the present 
invention may also be contained in blood plasma, serum, 
hybridoma supernatants, and the like. Alternatively, the 
antibody of the present invention is isolated to the 
extent desired by well known techniques such as, for 
example, using DEAE Sephadex. The antibodies produced 
according to the present invention may be further purified 
so as to obtain specific classes or subclasses of antibody 
such as IgM, IgG, IgA, and the like. Antibodies of the 
IgG class are preferred for purposes of passive 
protection. 

The antibodies of the present invention are 
useful in the prevention and treatment of diseases caused 
by hepatitis C virus in animals, especially mammals, and 
most especially humans. 

In providing the antibodies of the present 
invention to a recipient mammal, preferably a human, the 
dosage of administered antibodies will vary depending on 
such factors as the mammal's age, weight, height, sex, 
general medical condition, previous medical history, and 
the like. 

In general, it will be advantageous to provide 
the recipient mammal with a dosage of antibodies in the 
range of from about 1 mg/kg body weight to about 10 mg/kg 
body weight of the mammal, although a lower or higher dose 
may be administered if found desirable. Such antibodies 
will normally be administered by intravenous or 
intramuscular route as an inoculum. The antibodies of the 
present invention are intended to be provided to the 
recipient subject in an amount sufficient to prevent, 
lessen or attenuate the severity, extent or duration of 
any existing infection. 
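
The weight-based range stated above can be turned into absolute amounts with simple arithmetic. The following is a minimal sketch only; the function name and the 70 kg example weight are illustrative assumptions and are not part of the specification, which also allows lower or higher doses where found desirable.

```python
def antibody_dose_range_mg(body_weight_kg, low_mg_per_kg=1.0, high_mg_per_kg=10.0):
    """Return (low, high) total antibody amounts in mg for a given body weight.

    The default bounds are the 'about 1 mg/kg' and 'about 10 mg/kg' figures
    from the text; they are a stated range, not a clinical recommendation.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


# Hypothetical 70 kg recipient.
low_mg, high_mg = antibody_dose_range_mg(70)
print(f"{low_mg:.0f} mg to {high_mg:.0f} mg")
```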

The antibodies prepared by use of the nucleic 
acid sequences, viruses or polypeptides of the present 
invention are also highly useful for diagnostic purposes. 
For example, the antibodies can be used as in vitro 
diagnostic agents to test for the presence of HCV in 
biological samples taken from animals, especially humans. 
Such assays include, but are not limited to, 
radioimmunoassays, EIA, fluorescence, Western blot 
analysis and ELISAs. In one such embodiment, the 
biological sample is contacted with antibodies of the 
present invention and a labeled second antibody is used to 
detect the presence of HCV to which the antibodies are 
bound. 

Such assays may be, for example, a direct 
protocol (where the labeled first antibody is 
immunoreactive with the antigen, such as, for example, a 
polypeptide on the surface of the virus), an indirect 
protocol (where a labeled second antibody is reactive with 
the first antibody), a competitive protocol (such as would 
involve the addition of a labeled antigen), or a sandwich 
protocol (where both labeled and unlabeled antibody are 
used), as well as other protocols well known and described 
in the art. 

In one embodiment, an immunoassay method would 
utilize an antibody specific for HCV envelope determinants 
and would further comprise the steps of contacting a 
biological sample with the HCV-specific antibody and then 
detecting the presence of HCV material in the test sample 
using one of the types of assay protocols as described 
above. Polypeptides and antibodies produced according to 
the present invention may also be supplied in the form of 
a kit, either present in vials as purified material, or 
present in compositions and suspended in suitable diluents 
as previously described. 

In a preferred embodiment, such a diagnostic 
test kit for detection of HCV antigens in a test sample 
comprises in combination a series of containers, each 
container holding a reagent needed for such assay. Thus, 
one such container would contain a specific amount of 
HCV-specific antibody as already described, a second 
container would contain a diluent for suspension of the 
sample to be tested, a third container would contain a 
positive control and an additional container would contain 
a negative control. An additional container could contain 
a blank. 

For all prophylactic, therapeutic and diagnostic 
uses, the antibodies of the invention and other reagents, 
plus appropriate devices and accessories, may be provided 
in the form of a kit so as to facilitate ready 
availability and ease of use. 

The present invention also relates to the use of 
nucleic acid sequences and polypeptides of the present 
invention to screen potential antiviral agents for 
antiviral activity against HCV. Such screening methods 
are known by those of skill in the art. Generally, the 
antiviral agents are tested at a variety of 
concentrations for their effect on preventing viral 
replication in cell culture systems which support viral 
replication, and then for an inhibition of infectivity or 
of viral pathogenicity (and a low level of toxicity) in an 
animal model system. 

In one embodiment, animal cells (especially 
human cells) transfected with the nucleic acid sequences 
of the invention are cultured in vitro and the cells are 
treated with a candidate antiviral agent (a chemical, 
peptide etc.) for antiviral activity by adding the 
candidate agent to the medium. The treated cells are then 
exposed, possibly under transfecting or fusing conditions 
known in the art, to the nucleic acid sequences of the 
present invention. A sufficient period of time would then 
be allowed to pass for infection to occur, following which 
the presence or absence of viral replication would be 
determined versus untreated control cells by methods known 
to those of ordinary skill in the art. Such methods 
include, but are not limited to, the detection of viral 
antigens in the cell, for example, by immunofluorescent 
procedures well known in the art; the detection of viral 
polypeptides by Western blotting using antibodies specific 
therefor; the detection of newly transcribed viral RNA 
within the cells by RT-PCR; and the detection of the 
presence of live, infectious virus particles by injection 
of cell culture medium or cell lysates into healthy, 
susceptible animals, with subsequent exhibition of the 
symptoms of HCV infection. A comparison of results 
obtained for control cells (treated only with nucleic acid 
sequence) with those obtained for treated cells (nucleic 
acid sequence and antiviral agent) would indicate the 
degree, if any, of antiviral activity of the candidate 
antiviral agent. Of course, one of ordinary skill in the 
art would readily understand that such cells can be 
treated with the candidate antiviral agent either before 
or after exposure to the nucleic acid sequence of the 
present invention so as to determine what stage, or 
stages, of viral infection and replication said agent is 
effective against. 

In an alternative embodiment, a protease such as 
NS3 protease produced from a nucleic acid sequence of the 
invention may be used to screen for protease inhibitors 
which may act as antiviral agents. The structural and 
nonstructural regions of the HCV genome, including 
nucleotide and amino acid locations, have been determined, 
for example, as depicted in Houghton, M. (1996), Fig. 1; 
and Major, M.E. et al. (1997), Table 1. 

Such above-mentioned protease inhibitors may 
take the form of chemical compounds or peptides which 
mimic the known cleavage sites of the protease and may be 
screened using methods known to those of skill in the art 
(Houghton, M. (1996) and Major, M.E. et al. (1997)). For 
example, a substrate may be employed which mimics the 
protease's natural substrate, but which provides a 
detectable signal (e.g. by fluorimetric or colorimetric 
methods) when cleaved. This substrate is then incubated 
with the protease and the candidate protease inhibitor 
under conditions of suitable pH, temperature etc. to 
detect protease activity. The proteolytic activities of 
the protease in the presence or absence of the candidate 
inhibitor are then determined. 
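
One common way to compare the two activities is as a percent inhibition. The sketch below assumes that the detectable signal is proportional to the amount of cleaved substrate and that a no-enzyme background reading may be subtracted; the function name, the background parameter, and the example readings are illustrative assumptions, not taken from the specification.

```python
def percent_inhibition(signal_without_inhibitor, signal_with_inhibitor, background=0.0):
    """Percent reduction in protease activity caused by a candidate inhibitor.

    Signals are fluorimetric or colorimetric readings taken under identical
    conditions; 'background' is an assumed no-enzyme control reading.
    """
    uninhibited = signal_without_inhibitor - background
    inhibited = signal_with_inhibitor - background
    if uninhibited <= 0:
        raise ValueError("uninhibited signal must exceed background")
    return 100.0 * (1.0 - inhibited / uninhibited)


# Hypothetical readings: 1000 units without inhibitor, 250 units with it.
print(percent_inhibition(1000.0, 250.0))
```

A value near zero would suggest little effect on the protease, while a value near 100 would suggest nearly complete inhibition under the assay conditions used.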

In yet another embodiment, a candidate antiviral 
agent (such as a protease inhibitor) may be directly 
assayed in vivo for antiviral activity by administering 
the candidate antiviral agent to a chimpanzee transfected 
with a nucleic acid sequence of the invention and then 
measuring viral replication in vivo via methods such as 
RT-PCR. Of course, the chimpanzee may be treated with the 
candidate agent either before or after transfection with 
the infectious nucleic acid sequence so as to determine 
what stage, or stages, of viral infection and replication 
the agent is effective against. 

The invention also provides that the nucleic 
acid sequences, viruses and polypeptides of the invention 
may be supplied in the form of a kit, alone or in the form 
of a pharmaceutical composition. 

All scientific publications and/or patents cited 
herein are specifically incorporated by reference. The 
following examples illustrate various aspects of the 
invention but are in no way intended to limit the scope 
thereof. 

EXAMPLES 

MATERIALS AND METHODS 
For Examples 1-4 

Collection of Virus 

Hepatitis C virus was collected and used as a 
source for the RNA used in generating the cDNA clones 
according to the present invention. Plasma containing 
strain H77 of HCV was obtained from a patient in the acute 
phase of transfusion-associated non-A, non-B hepatitis 
(Feinstone et al. (1981)). Strain H77 belongs to genotype 
1a of HCV (Ogata et al. (1991), Inchauspe et al. (1991)). 
The consensus sequence for most of its genome has been 
determined (Kolykhalov et al. (1996), Ogata et al. (1991), 
Inchauspe et al. (1991) and Farci et al. (1996)). 



RNA Purification 

Viral RNA was collected and purified by 
conventional means. In general, total RNA from 10 μl of 
H77 plasma was extracted with the TRIzol system (GIBCO 
BRL). The RNA pellet was resuspended in 100 μl of 10 mM 
dithiothreitol (DTT) with 5% (vol/vol) RNasin (20-40 
units/μl) (available from Promega) and 10 μl aliquots were 
stored at -80°C. In subsequent experiments RT-PCR was 
performed on RNA equivalent to 1 μl of H77 plasma, which 
contained an estimated 10^5 genome equivalents (GE) of HCV 
(Yanagi et al. (1996)). 
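
The relationship between aliquot volume, plasma equivalent and genome equivalents in the preceding paragraph can be checked with a short calculation. This is a sketch that assumes complete recovery of RNA during extraction, which the text does not state; the function names are illustrative.

```python
PLASMA_EXTRACTED_UL = 10.0    # plasma volume extracted, from the text
RESUSPENSION_UL = 100.0       # volume in which the RNA pellet was resuspended
GE_PER_UL_PLASMA = 1e5        # estimated genome equivalents per 1 ul of plasma


def plasma_equivalent_ul(rna_volume_ul):
    """Plasma volume (ul) represented by a given volume of resuspended RNA."""
    return rna_volume_ul * PLASMA_EXTRACTED_UL / RESUSPENSION_UL


def genome_equivalents(rna_volume_ul):
    """Estimated HCV genome equivalents in a given volume of resuspended RNA."""
    return plasma_equivalent_ul(rna_volume_ul) * GE_PER_UL_PLASMA


# One 10 ul aliquot corresponds to 1 ul of plasma.
print(plasma_equivalent_ul(10.0), genome_equivalents(10.0))
```

Under these assumptions, each stored 10 μl aliquot holds the RNA equivalent of 1 μl of plasma, which matches the input used for RT-PCR in the subsequent experiments.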

Primers used in the RT-PCR process were deduced 
from the genomic sequences of strain H77 according to 
procedures already known in the art (see above) or else 
were determined specifically for use herein. The primers 
generated for this purpose are listed in Table 1. 

Table 1. Oligonucleotides used for PCR 
amplification of strain H77 of HCV 

Designation   Sequence (5' → 3') 

H9261F        GGCTACAGCGGGGGGAGACATTTATCACAGC 
H3'X58R       TCATGCGGCTCACGGACCTTTCACAGCTAG 
H9282F        GTCCAAGCTTATCACAGCGTGTCTCATGCCCGGCCCCG 
H3'X45R       CGTCTCTAGAGGACCTTTCACAGCTAGCCGTGACTAGGG 
H9375F        TGAAGGTTGGGGTAAACACTCCGGCCTCTTAGGCCATT 
H3'X-35R      ACATGATCTGCAGAGAGGCCAGTATCAGCACTCTC 
H9386F        GTCCAAGCTTACGCGTAAACACTCCGGCCTCCTTAAGCCATTCCTG 
H3'X-38R      CGTCTCTAGACATGATCTGCAGAGAGGCCAGTATCAGCACTCTCTGC 
H1            TTTTTTTTGCGGCCGCTAATACGACTCACTATAGCCAGCCCCCTGATGGGGGCGACACTCCACCATG 
A1            ACTGTCTTCACGCAGAAAGCGTCTAGCCAT 
H9417R        CGTCTCTAGACAGGAAATGGCTTAAGAGGCCGGAGTGTTTACC 

* HCV sequences are shown in plain text, non-HCV-specific 
sequences are shown in boldface and artificial cleavage sites 
used for cDNA cloning are underlined. The core sequence of the 
T7 promoter in primer H1 is shown in italics. 

† Primers for long RT-PCR were size-purified. 

cDNA Synthesis 

The RNA was denatured at 65°C for 2 min, and 
cDNA synthesis was performed in a 20 μl reaction volume 
with Superscript II reverse transcriptase (from GIBCO/BRL) 
at 42°C for 1 hour using specific antisense primers as 
described previously (Tellier et al. (1996)). The cDNA 
mixture was treated with RNase H and RNase T1 (GIBCO/BRL) 
for 20 min at 37°C. 

Amplification and Cloning of the 3' UTR 

The 3' UTR of strain H77 was amplified by PCR 
in two different assays. In both of these nested PCR 
reactions the first round of PCR was performed in a total 
volume of 50 μl in 1 x buffer, 250 pmol of each 
deoxynucleoside triphosphate (dNTP; Pharmacia), 20 pmol 
each of external sense and antisense primers, 1 μl of the 
Advantage KlenTaq polymerase mix (from Clontech) and 2 μl 
of the final cDNA reaction mixture. In the second round 
of PCR, 5 μl of the first round PCR mixture was added to 
45 μl of PCR mixture prepared as already described. Each 
round of PCR (35 cycles), which was performed in a Perkin 
Elmer DNA thermal cycler 480, consisted of denaturation at 
94°C for 1 min (in 1st cycle 1 min 30 sec), annealing at 
60°C for 1 min and elongation at 68°C for 2 min. In one 
experiment a region from NS5B to the conserved region of 
the 3' UTR was amplified with the external primers H9261F 
and H3'X58R, and the internal primers H9282F and H3'X45R 
(Table 1). In another experiment, a segment of the 
variable region to the very end of the 3' UTR was 
amplified with the external primers H9375F and H3'X-35R, 
and the internal primers H9386F and H3'X-38R (Table 1, 
Fig. 1). Amplified products were purified with QIAquick 
PCR purification kit (from QIAGEN), digested with Hind III 
and Xba I (from Promega), purified by either gel 
electrophoresis or phenol/chloroform extraction, and then 
cloned into the multiple cloning site of plasmid 
pGEM-9zf(-) (Promega) or pUC19 (Pharmacia). Cloning of cDNA 
into the vector was performed with T4 DNA ligase (Promega) 
by standard procedures. 

Amplification of Near Full-Length H77 Genomes by Long PCR 

The reactions were performed in a total volume 
of 50 μl in 1 x buffer, 250 μmol of each dNTP, 10 pmol 
each of sense and antisense primers, 1 μl of the Advantage 
KlenTaq polymerase mix and 2 μl of the cDNA reaction 
mixture (Tellier et al. (1996)). A single PCR round of 35 
cycles was performed in a Robocycler thermal cycler (from 
Stratagene), and consisted of denaturation at 99°C for 35 
sec, annealing at 67°C for 30 sec and elongation at 68°C 
for 10 min during the first 5 cycles, 11 min during the 
next 10 cycles, 12 min during the following 10 cycles and 
13 min during the last 10 cycles. To amplify the complete 
ORF of HCV by long RT-PCR we used the sense primers H1 or 
A1 deduced from the 5' UTR and the antisense primer H9417R 
deduced from the variable region of the 3' UTR (Table 1, 
Fig. 1). 
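
The stepped elongation schedule described above can be tabulated to confirm that the blocks add up to 35 cycles and to estimate the programmed run time. The following sketch counts only the stated hold times; temperature ramping between steps is ignored, which is an assumption, and the names used are illustrative.

```python
# (number of cycles, elongation minutes per cycle), as stated in the text.
ELONGATION_BLOCKS = [(5, 10), (10, 11), (10, 12), (10, 13)]
DENATURATION_SEC = 35   # 99 C
ANNEALING_SEC = 30      # 67 C


def total_cycles(blocks=ELONGATION_BLOCKS):
    """Total number of PCR cycles across all elongation blocks."""
    return sum(cycles for cycles, _ in blocks)


def elongation_minutes(blocks=ELONGATION_BLOCKS):
    """Total time spent in the 68 C elongation step, in minutes."""
    return sum(cycles * minutes for cycles, minutes in blocks)


def programmed_hold_minutes(blocks=ELONGATION_BLOCKS):
    """Total programmed hold time in minutes, excluding temperature ramps."""
    fixed = total_cycles(blocks) * (DENATURATION_SEC + ANNEALING_SEC) / 60
    return elongation_minutes(blocks) + fixed


print(total_cycles(), elongation_minutes(), round(programmed_hold_minutes(), 1))
```

On these figures the elongation steps alone account for 410 minutes, so a single long PCR round occupies the thermal cycler for roughly seven and a half hours.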

Construction of Full-Length H77 cDNA Clones 

The long PCR products amplified with H1 and 
H9417R primers were cloned directly into pGEM-9zf(-) after 
digestion with Not I and Xba I (from Promega) (as per 
Fig. 1). Two clones were obtained with inserts of the 
expected size, pH21I and pH50I. Next, the chosen 3' UTR 
was cloned into both pH21I and pH50I after digestion with 
Afl II and Xba I (New England Biolabs). DH5α competent 
cells (GIBCO/BRL) were transformed and selected with LB 
agar plates containing 100 μg/ml ampicillin (from SIGMA). 
Then the selected colonies were cultured in LB liquid 
containing ampicillin at 30°C for ~18-20 hrs 
(transformants containing full-length or near full-length 
cDNA of H77 produced a very low yield of plasmid when 
cultured at 37°C or for more than 24 hrs). After small 
scale preparation (Wizard Plus Minipreps DNA Purification 
Systems, Promega) each plasmid was retransformed to select 
a single clone, and large scale preparation of plasmid DNA 
was performed with a QIAGEN plasmid Maxi kit. 

Cloning of Long RT-PCR Products Into a Cassette Vector 

To improve the efficiency of cloning, a vector 
with consensus 5' and 3' termini of HCV strain H77 was 
constructed (Fig. 1). This cassette vector (pCV) was 
obtained by cutting out the BamHI fragment (nts 1358-7530 
of the H77 genome) from pH50, followed by religation. 
Next, the long PCR products of H77 amplified with H1 and 
H9417R or A1 and H9417R primers were purified (Geneclean 
spin kit; BIO 101) and cloned into pCV after digestion 
with Age I and Afl II (New England Biolabs) or with Pin AI 
(isoschizomer of Age I) and Bfr I (isoschizomer of Afl II) 
(Boehringer Mannheim). Large scale preparations of the 
plasmids containing full-length cDNA of H77 were performed 
as described above. 






Construction of H77 Consensus Chimeric cDNA Clone 

A full-length cDNA clone of H77 with an ORF 
encoding the consensus amino acid sequence was constructed 
by making a chimera from four of the cDNA clones obtained 
above. This consensus chimera, pCV-H77C, was constructed 
in two ligation steps by using standard molecular 
procedures and convenient cleavage sites and involved 
first a two piece ligation and then a three piece 
ligation. Large scale preparation of pCV-H77C was 
performed as already described. 

In Vitro Transcription 

Plasmids containing the full-length HCV cDNA 
were linearized with Xba I (from Promega), and purified by 
phenol/chloroform extraction and ethanol precipitation. A 
100 μl reaction mixture containing 10 μg of linearized 
plasmid DNA, 1 x transcription buffer, 1 mM ATP, CTP, GTP 
and UTP, 10 mM DTT, 4% (v/v) RNasin (20-40 units/μl) and 2 
μl of T7 RNA polymerase (Promega) was incubated at 37°C 
for 2 hrs. Five μl of the reaction mixture was analyzed 
by agarose gel electrophoresis followed by ethidium 
bromide staining. The transcription reaction mixture was 
diluted with 400 μl of ice-cold phosphate-buffered saline 
without calcium or magnesium, immediately frozen on dry 
ice and stored at -80°C. The final nucleic acid mixture 
was injected into chimpanzees within 24 hrs. 
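
The volumes in the transcription procedure above can be summarized with a short calculation. This sketch assumes that the 5 μl gel sample was removed before the 400 μl of saline was added, which the text implies but does not state outright; the function name and dictionary keys are illustrative.

```python
def transcription_mix_summary(reaction_ul=100.0, dna_ug=10.0, rnasin_fraction=0.04,
                              gel_sample_ul=5.0, pbs_ul=400.0):
    """Summarize volumes for one in vitro transcription reaction.

    Assumes the gel sample is withdrawn before dilution with saline.
    """
    rnasin_ul = reaction_ul * rnasin_fraction          # 4% (v/v) RNasin
    remaining_ul = reaction_ul - gel_sample_ul         # left after gel analysis
    final_volume_ul = remaining_ul + pbs_ul            # after adding saline
    dna_template_ug = dna_ug * remaining_ul / reaction_ul
    return {
        "rnasin_ul": rnasin_ul,
        "remaining_ul": remaining_ul,
        "final_volume_ul": final_volume_ul,
        "dna_template_ug": dna_template_ug,
        "dilution_factor": final_volume_ul / remaining_ul,
    }


summary = transcription_mix_summary()
print(summary["final_volume_ul"], round(summary["dilution_factor"], 2))
```

Under this assumption each reaction yields about 495 μl of diluted mixture, which still contains the plasmid template alongside the RNA transcripts.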

Intrahepatic Transfection of Chimpanzees 

Laparotomy was performed and aliquots from two 
transcription reactions were injected into 6 sites of the 
exposed liver (Emerson et al. (1992)). Serum samples were 
collected weekly from chimpanzees and monitored for liver 
enzyme levels and anti-HCV antibodies. Weekly samples of 
100 μl of serum were tested for HCV RNA in a highly 
sensitive nested RT-PCR assay with AmpliTaq Gold (Perkin 
Elmer) (Yanagi et al. (1996); Bukh et al. (1992)). The 
genome titer of HCV was estimated by testing 10-fold 
serial dilutions of the extracted RNA in the RT-PCR assay 
(Yanagi et al. (1996)). The two chimpanzees used in this 
study were maintained under conditions that met all 
requirements for their use in an approved facility. 

The consensus sequence of the complete ORF from
HCV genomes recovered at week 2 post inoculation (p.i.)
was determined by direct sequencing of PCR products
obtained in long RT-PCR with primers A1 and H9417R
followed by nested PCR of 10 overlapping fragments. The
consensus sequence of the variable region of the 3' UTR
was determined by direct sequencing of an amplicon
obtained in nested RT-PCR as described above. Finally, we
amplified selected regions independently by nested RT-PCR
with AmpliTaq Gold.

Sequence Analysis

Both strands of DNA from PCR products, as well
as plasmids, were sequenced with the ABI PRISM Dye
Termination Cycle Sequencing Ready Reaction Kit using Taq
DNA polymerase (Perkin Elmer) and about 100 specific sense
and antisense sequence primers.

The consensus sequence of HCV strain H77 was
determined in two different ways. In one approach,
overlapping PCR products amplified in nested RT-PCR from
the H77 plasma sample were directly sequenced.
The sequence analyzed (nucleotides (nts) 35-9417) included
the entire genome except the very 5' and 3' termini. In
the second approach, the consensus sequence of nts 157-9384
was deduced from the sequences of 18 full-length cDNA
clones.

EXAMPLE 1

Variability in the sequence of the 3' UTR of HCV strain 
H77 

The heterogeneity of the 3' UTR was analyzed by
cloning and sequencing of DNA amplicons obtained in nested
RT-PCR. 19 clones containing sequences of the entire
variable region, the poly U-UC region and the adjacent 19
nt of the conserved region, and 65 clones containing
sequences of the entire poly U-UC region and the first 63
nts of the conserved region were analyzed. This analysis
confirmed that the variable region consisted of 43 nts,
including two conserved termination codons (Han et al
(1992)). The sequence of the variable region was highly
conserved within H77 since only 3 point mutations were
found among the 19 clones analyzed. A poly U-UC region
was present in all 84 clones analyzed. However, its
length varied from 71-141 nts. The length of the poly U
region was 9-103 nts, and that of the poly UC region was
35-85 nts. The number of C residues increased towards the
3' end of the poly UC region but the sequence of this
region is not conserved. The first 63 nts of the
conserved region were highly conserved among the clones
analyzed, with a total of only 14 point mutations. To
confirm the validity of the analysis, the 3' UTR was
reamplified directly from a full-length cDNA clone of HCV
(see below) by the nested-PCR procedure with the primers
in the variable region and at the very 3' end of the HCV
genome, and the PCR product was cloned. Eight clones had
1-7 nt deletions in the poly U region. Furthermore, although
the C residues of the poly UC region were maintained, the
spacing of these varied because of 1-2 nt deletions of U
residues. These deletions must be artifacts introduced by
PCR and such mistakes may have contributed to the
heterogeneity originally observed in this region.
However, the conserved region of the 3' UTR was amplified
correctly, suggesting that the deletions were due to
difficulties in transcribing a highly repetitive sequence.

One of the 3' UTR clones was selected for
engineering of full-length cDNA clones of H77. This clone
had the consensus variable sequence except for a single
point mutation introduced to create an Afl II cleavage
site, a poly U-UC stretch of 81 nts with the most commonly
observed UC pattern and the consensus sequence of the
complete conserved region of 101 nts, including the distal
38 nts which originated from the antisense primer used in
the amplification. After linearization with Xba I, the
DNA template of this clone had the authentic 3' end.

EXAMPLE 2 



The Entire Open Reading Frame of H77
Amplified in One Round of Long RT-PCR

It had been previously demonstrated that a 9.25
kb fragment of the HCV genome from the 5' UTR to the 3'
end of NS5B could be amplified from 10^4 GE (genome
equivalents) of H77 by a single round of long RT-PCR
(Tellier et al (1996a)). In the current study, by
optimizing primers and cycling conditions, the entire ORF
of H77 was amplified in a single round of long RT-PCR with
primers from the 5' UTR and the variable region of the 3'
UTR. In fact, 9.4 kb of the H77 genome (H product: from
the very 5' end to the variable region of the 3' UTR) could
be amplified from 10^5 GE or 9.3 kb (A product: from within
the 5' UTR to the variable region of the 3' UTR) from 10^4
GE or 10^5 GE, in a single round of long RT-PCR (Fig. 2).
The PCR products amplified from 10^5 GE of H77 were used for
engineering full-length cDNA clones (see below).

EXAMPLE 3 

Construction of Multiple Full-Length
cDNA Clones of H77 in a Single Step by
Cloning of Long RT-PCR Amplicons Directly
into a Cassette Vector with Fixed 5' and 3' Termini



Direct cloning of the long PCR products (H),
which contained a 5' T7 promoter, the authentic 5' end, the
entire ORF of H77 and a short region of the 3' UTR, into
pGEM-9zf(-) vector by Not I and Xba I digestion was first
attempted. However, among the 70 clones examined all but
two had inserts that were shorter than predicted. Sequence
analysis identified a second Not I site in the majority of
clones, which resulted in deletion of the nts past
position 9221. Only two clones (pH21j and pH50j) were
missing the second Not I site and had the expected 5' and
3' sequences of the PCR product. Therefore, full-length
cDNA clones (pH21 and pH50) were constructed by inserting
the chosen 3' UTR into pH21j and pH50j, respectively.
Sequence analysis revealed that clone pH21 had a complete
full-length sequence of H77; this clone was tested for
infectivity. The second clone, pH50, had one nt deletion
in the ORF at position 6365; this clone was used to make a
cassette vector.
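
The truncated inserts described above trace back to an internal Not I site. A minimal sketch of how an insert sequence might be screened for internal recognition sites before choosing cloning enzymes is shown here. The function name and the toy sequence are hypothetical and are not H77 sequence; only the Not I recognition sequence (GCGGCCGC) is a known fact. Because this site is palindromic, scanning one strand is sufficient.

```python
def find_sites(sequence, site):
    """Return 1-based start positions of every occurrence of `site`.

    Overlapping occurrences are reported as well.
    """
    sequence = sequence.upper()
    site = site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based position
        start = sequence.find(site, start + 1)
    return positions


NOT_I = "GCGGCCGC"  # Not I recognition sequence (palindromic)

# Hypothetical toy insert, for illustration only (not HCV sequence).
toy_insert = "ATGCGGCCGCTTACGATCGGCGGCCGCAA"

sites = find_sites(toy_insert, NOT_I)
print(sites)  # more than one hit means the enzyme would also cut inside the insert
```

An insert with more than one hit would be expected to lose the sequence beyond the internal site on digestion, which is the pattern reported for the majority of clones.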

The complete ORF was amplified by constructing a
cassette vector with fixed 5' and 3' termini as an
intermediate of the full-length cDNA clones. This vector
(pCV) was constructed by digestion of clone pH50 with
BamHI, followed by religation, to give a shortened plasmid
readily distinguished from plasmids containing the
full-length insert. Attempts to clone long RT-PCR products (H)
into pCV by Age I and Afl II yielded only 1 of 23 clones
with an insert of the expected size. In order to increase
the efficiency of cloning, we repeated the procedure but
used Pin AI and Bfr I instead of the respective
isoschizomers Age I and Afl II. By this protocol, 24 of
31 H clones and 30 of 35 A clones had the full-length cDNA
of H77 as evaluated by restriction enzyme digestion. A
total of 16 clones, selected at random, were each
retransformed, and individual plasmids were purified and
completely sequenced.

EXAMPLE 4 

Demonstration of Infectious Nature
of Transcripts of a cDNA Clone
Representing the Consensus Sequence of Strain H77

A consensus chimera was constructed from 4 of
the full-length cDNA clones with just 2 ligation steps.
The final construct, pCV-H77C, had 11 nt differences from
the consensus sequence of H77 in the ORF (Fig. 3).
However, 10 of these nucleotide differences represented
silent mutations. The chimeric clone differed from the
consensus sequence at only one amino acid [L instead of F
at position 790]. Among the 18 ORFs analyzed above, the F
residue was found in 11 clones and the L residue in 7
clones. However, the L residue was dominant in other
isolates of genotype 1a, including a first passage of H77
in a chimpanzee (Inchauspe et al (1991)).

To test the infectivity of the consensus
chimeric clone of H77, intrahepatic transfection of a
chimpanzee was performed. The pCV-H77C clone was
linearized with Xba I and transcribed in vitro by T7 RNA
polymerase (Fig. 2). The transcription mixture was next
injected into 6 sites of the liver of chimpanzee 1530.
The chimpanzee became infected with HCV as measured by
detection of 10^2 GE/ml of viral genome at week 1 p.i.
Furthermore, the HCV titer increased to 10^4 GE/ml at week
2 p.i., and reached 10^6 GE/ml by week 8 p.i. The viremic
pattern observed in the early phase of the infection with
the recombinant virus was similar to that observed in
chimpanzees inoculated intravenously with strain H77 or
other strains of HCV (Shimizu (1990)).

The sequence of the HCV genomes from the serum
sample collected at week 2 p.i. was analyzed. The
consensus sequence of nts 298-9375 of the recovered
genomes was determined by direct sequencing of PCR
products obtained in long RT-PCR followed by nested PCR of
10 overlapping fragments. The identity to clone pCV-H77C
sequence was 100%. The consensus sequence of nts 96-291,
1328-1848, 3585-4106, 4763-5113 and 9322-9445 was
determined from PCR products obtained in different nested
RT-PCR assays. The identity of these sequences with
pCV-H77C was also 100%. These latter regions contained 4
mutations unique to the consensus chimera, including the
artificial Afl II cleavage site in the 3' UTR. Therefore,
RNA transcripts of this clone of HCV were infectious.

The infectious nature of the consensus chimera
indicates that the regions of the 5' and 3' UTRs
incorporated into the cassette vector do not destroy
viability. This makes it highly advantageous to use the
cassette vector to construct infectious cDNA clones of
other HCV strains when the consensus sequence for each ORF
is inserted.

In addition, two complete full-length clones
(dubbed pH21 and pCV-H11) constructed were not infectious,
as shown by intrahepatic injection of chimpanzees with the
corresponding RNA transcripts. Thus, injection of the
transcription mixture into 3 sites of the exposed liver
resulted in no observable HCV replication and weekly serum
samples were negative for HCV RNA at weeks 1 - 17 p.i. in
a highly sensitive nested RT-PCR assay. The cDNA template
injected along with the RNA transcripts was also not
detected in this assay.

Moreover, the chimpanzee remained negative for
antibodies to HCV throughout the follow-up. Subsequent
sequence analysis revealed that 7 of 16 additional clones
were defective for polyprotein synthesis and all clones
had multiple amino acid mutations compared with the
consensus sequence of the parent strain. For example,
clone pH21, which was not infectious, had 7 amino acid
substitutions in the entire predicted polyprotein compared
with the consensus sequence of H77 (Fig. 3). The most
notable mutation was at position 1026, which changed L to
Q, altering the cleavage site between NS2 and NS3 (Reed
(1995)). Clone pCV-H11, also non-infectious, had 21 amino
acid substitutions in the predicted polyprotein compared
with the consensus sequence of H77 (Fig. 3). The amino
acid mutation at position 564 eliminated a highly
conserved C residue in the E2 protein (Okamoto (1992a)).

EXAMPLE 4A 

The chimpanzee of Example 4, designated 1530,
was monitored out to 32 weeks p.i. for serum enzyme levels
(ALT) and the presence of anti-HCV antibodies, HCV RNA,
and liver histopathology. The results are shown in Figure
18B.

A second chimp, designated 1494, was also
transfected with RNA transcripts of the pCV-H77C clone and
monitored out to 17 weeks p.i. for the presence of
anti-HCV antibodies, HCV RNA and elevated serum enzyme levels.
The results are shown in Figure 18A.

MATERIALS AND METHODS 
for Examples 5-10 



Source Of HCV Genotype 1b

An infectious plasma pool (second chimpanzee
passage) containing strain HC-J4, genotype 1b, was
prepared from acute phase plasma of a chimpanzee
experimentally infected with serum containing HC-J4/91
(Okamoto et al., 1992b). The HC-J4/91 sample was obtained
from a first chimpanzee passage during the chronic phase
of hepatitis C about 8 years after experimental infection.
The consensus sequence of the entire genome, except for
the very 3' end, was determined previously for HC-J4/91
(Okamoto et al., 1992b).

Preparation Of HCV RNA

Viral RNA was extracted from 100 µl aliquots of
the HC-J4 plasma pool with the TRIzol system (GIBCO BRL).
The RNA pellets were each resuspended in 10 µl of 10 mM
dithiothreitol (DTT) with 5% (vol/vol) RNasin (20-40
units/µl) (Promega) and stored at -80°C or immediately
used for cDNA synthesis.

Amplification And Cloning Of The 3' UTR 

A region spanning from NS5B to the conserved
region of the 3' UTR was amplified in nested RT-PCR using
the procedure of Yanagi et al. (1997).

In brief, the RNA was denatured at 65°C for 2
minutes, and cDNA was synthesized at 42°C for 1 hour with
Superscript II reverse transcriptase (GIBCO BRL) and
primer H3'X58R (Table 1) in a 20 µl reaction volume. The
cDNA mixture was treated with RNase H and RNase T1 (GIBCO
BRL) at 37°C for 20 minutes. The first round of PCR was
performed on 2 µl of the final cDNA mixture in a total
volume of 50 µl with the Advantage cDNA polymerase mix
(Clontech) and external primers H9261F (Table 1) and
H3'X58R (Table 1). In the second round of PCR [internal
primers H9282F (Table 1) and H3'X45R (Table 1)], 5 µl of
the first round PCR mixture was added to 45 µl of the PCR
reaction mixture. Each round of PCR (35 cycles) was
performed in a DNA thermal cycler 480 (Perkin Elmer) and
consisted of denaturation at 94°C for 1 minute (1st cycle:
1 minute 30 sec), annealing at 60°C for 1 minute and
elongation at 68°C for 2 minutes. After purification with
QIAquick PCR purification kit (QIAGEN), digestion with
HindIII and XbaI (Promega), and phenol/chloroform
extraction, the amplified products were cloned into
pGEM-9zf(-) (Promega) (Yanagi et al., 1997).

Amplification And Cloning Of The Entire ORF

A region from within the 5' UTR to the variable
region of the 3' UTR of strain HC-J4 was amplified by long
RT-PCR (Fig. 1) (Yanagi et al., 1997). The cDNA was
synthesized at 42°C for 1 hour in a 20 µl reaction volume
with Superscript II reverse transcriptase and primer
J4-9405R (5'-GCCTATTGGCCTGGAGTGGTTAGCTC-3'), and treated with
RNases as above. The cDNA mixture (2 µl) was amplified by
long PCR with the Advantage cDNA polymerase mix and
primers A1 (Table 1) (Bukh et al., 1992; Yanagi et al.,
1997) and J4-9398R
(5'-AGGATGG CCTTAAG GCCTGGAGTGGTTAGCTCCCCGTTCA-3'). Primer
J4-9398R contained extra bases (bold) and an artificial AflII
cleavage site (underlined). A single PCR round was
performed in a Robocycler thermal cycler (Stratagene), and
consisted of denaturation at 99°C for 35 seconds,
annealing at 67°C for 30 seconds and elongation at 68°C
for 10 minutes during the first 5 cycles, 11 minutes
during the next 10 cycles, 12 minutes during the following
10 cycles and 13 minutes during the last 10 cycles.
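
The stepped elongation schedule of the long PCR can be written out cycle by cycle, which makes the total cycle count and the programmed run time easy to check. The following is a minimal sketch; the function names and dictionary keys are illustrative choices, not part of any thermal cycler interface, and the time spent ramping between temperatures is ignored.

```python
# Elongation schedule as stated in the text: (number of cycles, minutes at 68 C).
ELONGATION_BLOCKS = [(5, 10), (10, 11), (10, 12), (10, 13)]


def build_program(denature_s=35, anneal_s=30, blocks=ELONGATION_BLOCKS):
    """Expand the block schedule into one dict of hold times per cycle."""
    program = []
    for n_cycles, elongation_min in blocks:
        for _ in range(n_cycles):
            program.append({
                "denature_99C_s": denature_s,
                "anneal_67C_s": anneal_s,
                "elongate_68C_s": elongation_min * 60,
            })
    return program


def total_hold_seconds(program):
    """Sum of all programmed hold times (ramping between steps is ignored)."""
    return sum(sum(cycle.values()) for cycle in program)


program = build_program()
print("cycles:", len(program))
print("programmed hold time (h):", round(total_hold_seconds(program) / 3600, 2))
```

Under these assumptions the schedule expands to 35 cycles, the same number of cycles used per round in the nested RT-PCR of the 3' UTR, with elongation accounting for nearly all of the run time.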

After digesting the long PCR products obtained
from strain HC-J4 with PinAI (isoschizomer of AgeI) and
BfrI (isoschizomer of AflII) (Boehringer Mannheim),
attempts were made to clone them directly into a cassette
vector (pCV), which contained the 5' and 3' termini of
strain H77 (Figure 1) but no full-length clones were
obtained. Accordingly, to improve the efficiency of
cloning, the PCR product was further digested with BglII
(Boehringer Mannheim) and the two resultant genome
fragments [L fragment: PinAI/BglII, nts 156 - 8935; S
fragment: BglII/BfrI, nts 8936 - 9398] were separately
cloned into pCV (Figure 6).

DH5α competent cells (GIBCO BRL) were
transformed and selected on LB agar plates containing 100
µg/ml ampicillin (SIGMA) and amplified in LB liquid
cultures at 30°C for 18-20 hours.

Sequence analysis of 9 plasmids containing the S
fragment (miniprep samples) and 9 plasmids containing the
L fragment (maxiprep samples) was performed as described
previously (Yanagi et al., 1997). Three L fragments, each
encoding a distinct polypeptide, were cloned into pCV-J4S9
(which contained an S fragment encoding the consensus
amino acid sequence of HC-J4) to construct three chimeric
full-length HCV cDNAs (pCV-J4L2S, pCV-J4L4S and pCV-J4L6S)
(Fig. 6). Large scale preparation of each clone was
performed as described previously with a QIAGEN plasmid
Maxi kit (Yanagi et al., 1997) and the authenticity of
each clone was confirmed by sequence analysis.

Sequence Analysis

Both strands of DNA were sequenced with the ABI
PRISM Dye Termination Cycle Sequencing Ready Reaction Kit
using Taq DNA polymerase (Perkin Elmer) and about 90
specific sense and antisense primers. Analyses of genomic
sequences, including multiple sequence alignments and tree
analyses, were performed with GeneWorks (Oxford Molecular
Group) (Bukh et al., 1995).

The consensus sequence of strain HC-J4 was
determined by direct sequencing of PCR products (nts 11 -
9412) and by sequence analysis of multiple cloned L and S
fragments (nts 156 - 9371). The consensus sequence of the
3' UTR (3' variable region, polypyrimidine tract and the
first 16 nucleotides of the conserved region) was
determined by analysis of 24 cDNA clones.
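
Deducing a consensus from multiple cloned sequences amounts to taking the most frequent residue at each aligned position. A minimal sketch of that idea follows. It assumes the clone sequences have already been aligned to equal length; the function name, the tie-handling rule (ties reported as "N") and the toy sequences are illustrative assumptions and are not taken from the GeneWorks analysis described above.

```python
from collections import Counter


def majority_consensus(aligned_clones, ambiguity="N"):
    """Return the most frequent residue at each column of aligned sequences.

    Columns in which two or more residues tie for first place are reported
    with the `ambiguity` symbol rather than being resolved arbitrarily.
    """
    if not aligned_clones:
        raise ValueError("need at least one sequence")
    length = len(aligned_clones[0])
    if any(len(seq) != length for seq in aligned_clones):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_clones):
        ranked = Counter(column).most_common()
        top_residue, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        consensus.append(ambiguity if tied else top_residue)
    return "".join(consensus)


# Hypothetical toy alignment, for illustration only (not HCV sequence).
toy_clones = ["ACGTAC", "ACGTTC", "ACCTAC", "ACGTAC"]
print(majority_consensus(toy_clones))
```

Reporting ties explicitly is one possible design choice; it mirrors the situation in which cloned sequences display heterogeneity and no single residue can be called.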

Intrahepatic Transfection Of A Chimpanzee
With Transcribed RNA

Two in vitro transcription reactions were
performed with each of the three full-length clones. In
each reaction 10 µg of plasmid DNA linearized with Xba I
(Promega) was transcribed in a 100 µl reaction volume with
T7 RNA polymerase (Promega) at 37°C for 2 hours as
described previously (Yanagi et al., 1997). Five µl of
the final reaction mixture was analyzed by agarose gel
electrophoresis and ethidium bromide staining (Fig. 5).
Each transcription mixture was diluted with 400 µl of
ice-cold phosphate-buffered saline without calcium or
magnesium and then the two aliquots from the same cDNA
clone were combined, immediately frozen on dry ice and
stored at -80°C. Within 24 hours after freezing the
transcription mixtures were injected into the chimpanzee
by percutaneous intrahepatic injection that was guided by
ultrasound. Each inoculum was individually injected (5-6
sites) into a separate area of the liver to prevent
complementation or recombination. The chimpanzee was
maintained under conditions that met all requirements for
its use in an approved facility.

Serum samples were collected weekly from the
chimpanzee and monitored for liver enzyme levels and
anti-HCV antibodies. Weekly samples of 100 µl of serum
were tested for HCV RNA in a sensitive nested RT-PCR assay
(Bukh et al., 1992; Yanagi et al., 1996) with AmpliTaq
Gold DNA polymerase. The genome equivalent (GE) titer of
HCV was determined by testing 10-fold serial dilutions of
the extracted RNA in the RT-PCR assay (Yanagi et al.,
1996) with 1 GE defined as the number of HCV genomes
present in the highest dilution which was positive in the
RT-nested PCR assay.
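
The endpoint-dilution definition of a genome equivalent given above can be sketched as a short calculation. The function name, the layout of the results dictionary and the `serum_equiv_ul` parameter (the volume of serum represented by the undiluted RT-PCR input) are assumptions made for illustration; the text does not state what fraction of the extracted RNA was tested in each reaction, so the numbers below are a worked example only.

```python
def endpoint_titer_ge_per_ml(results, serum_equiv_ul):
    """Estimate a GE/ml titer from a 10-fold dilution series.

    results: dict mapping log10 dilution (0 = undiluted, 1 = 1:10, ...) to
             True if the nested RT-PCR was positive at that dilution.
    serum_equiv_ul: microlitres of serum represented by the undiluted input
                    (an assumed quantity, not given in the text).
    Returns None when no dilution was positive.
    """
    positive = [log10_dil for log10_dil, is_pos in results.items() if is_pos]
    if not positive:
        return None
    endpoint = max(positive)          # highest dilution still positive
    ge_in_undiluted = 10 ** endpoint  # that dilution holds 1 GE by definition
    return ge_in_undiluted * 1000 / serum_equiv_ul


# Hypothetical series: positive through the 1:100 dilution, negative beyond.
series = {0: True, 1: True, 2: True, 3: False, 4: False}
print(endpoint_titer_ge_per_ml(series, serum_equiv_ul=10))
```

Because the series is 10-fold, a titer obtained this way is resolved only to the nearest order of magnitude, which is consistent with titers being reported as powers of ten.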

To identify which of the three clones was
infectious in vivo, the NS3 region (nts 3659 - 4110) from
the chimpanzee serum was amplified in a highly sensitive
and specific nested RT-PCR assay with AmpliTaq Gold DNA
polymerase and the PCR products were cloned with a TA
cloning kit (Invitrogen). In addition, the consensus
sequence of the nearly complete genome (nts 11 - 9441) was
determined by direct sequencing of overlapping PCR
products.

EXAMPLE 5

Sequence Analysis Of Infectious Plasma Pool 
Of Strain HC-J4 Used As The Cloning Source 

As an infectious cDNA clone of a genotype 1a
strain of HCV had been obtained only after the ORF was
engineered to encode the consensus polypeptide (Kolykhalov
et al., 1997; Yanagi et al., 1997), a detailed sequence
analysis of the cloning source was performed to determine
the consensus sequence prior to constructing an infectious
cDNA clone of a 1b genotype.

A plasma pool of strain HC-J4 was prepared from
acute phase plasmapheresis units collected from a
chimpanzee experimentally infected with HC-J4/91 (Okamoto
et al., 1992b). This HCV pool had a PCR titer of 10^4 -
10^5 GE/ml and an infectivity titer of approximately 10^3
chimpanzee infectious doses per ml.

The heterogeneity of the 3' UTR of strain HC-J4
was determined by analyzing 24 clones of nested RT-PCR
product. The consensus sequence was identical to that
previously published for HC-J4/91 (Okamoto et al., 1992b),
except at position 9407 (see below). The variable region
consisted of 41 nucleotides (nts 9372 - 9412), including
two in-frame termination codons. Furthermore, its
sequence was highly conserved except at positions 9399 (19
A and 5 T clones) and 9407 (17 T and 7 A clones). The
poly U-UC region varied slightly in composition and
greatly in length (31-162 nucleotides). In the conserved
region, the first 16 nucleotides of 22 clones were
identical to those previously published for other genotype
1 strains, whereas two clones each had a single point
mutation. These data suggested that the structural
organization at the 3' end of HC-J4 was similar to that of
the infectious clone of a genotype 1a strain of Yanagi et
al (1997).

Next, the entire ORF of HC-J4 was amplified in a
single round of long RT-PCR (Figure 5). The original plan
was to clone the resulting PCR products into the PinAI and
BfrI site of a HCV cassette vector (pCV), which had fixed
5' and 3' termini of genotype 1a (Yanagi et al., 1997) but
since full-length clones were not obtained, two genome
fragments (L and S) derived from the long RT-PCR products
(Figure 6) were separately subcloned into pCV.

To determine the consensus sequence of the ORF,
the sequence of 9 clones each of the L fragment (pCV-J4L)
and of the S fragment (pCV-J4S) was determined and
quasispecies were found at 275 nucleotide (3.05%) and 78
amino acid (2.59%) positions, scattered throughout the
9030 nts (3010 aa) of the ORF (Figure 7). Of the 161
nucleotide substitutions unique to a single clone, 71%
were at the third position of the codon and 72% were
silent.
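
Classifying substitutions by codon position and by whether they change the encoded amino acid can be sketched as below. This is an illustrative reconstruction, not the procedure used in the study: it assumes two in-frame DNA coding sequences of equal length without gaps, uses the standard genetic code, and scores each substitution against its own codon as a whole. The function name and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitutions(reference, variant):
    """List (nt_position, codon_position, is_silent) for each difference.

    Both sequences must be in-frame DNA (A, C, G, T) of equal length whose
    length is a multiple of three. Positions are 1-based.
    """
    if len(reference) != len(variant) or len(reference) % 3:
        raise ValueError("sequences must be in-frame and of equal length")
    found = []
    for i, (ref_nt, var_nt) in enumerate(zip(reference, variant)):
        if ref_nt == var_nt:
            continue
        start = i - i % 3  # first base of the codon containing position i
        ref_aa = CODON_TABLE[reference[start:start + 3]]
        var_aa = CODON_TABLE[variant[start:start + 3]]
        found.append((i + 1, i % 3 + 1, ref_aa == var_aa))
    return found


# Hypothetical toy sequences (Met-Leu-Gly versus Met-Leu-Glu), not HCV sequence.
toy_reference = "ATGCTTGGA"
toy_variant = "ATGCTCGAA"
for change in classify_substitutions(toy_reference, toy_variant):
    print(change)
```

Tallying the second and third fields over all clone-specific substitutions would give the kind of summary quoted above, in which most changes fall at the third codon position and most are silent.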

Each of the nine L clones represented the near
complete ORF of an individual genome. The differences
among the L clones were 0.30 - 1.53% at the nucleotide and
0.31 - 1.47% at the amino acid level (Figure 8). Two
clones, L1 and L7, had a defective ORF due to a single
nucleotide deletion and a single nucleotide insertion,
respectively. Even though the HC-J4 plasma pool was
obtained in the early acute phase, it appeared to contain
at least three viral species (Figure 9). Species A
contained the L1, L2, L6, L8 and L9 clones, species B the
L3, L7 and L10 clones and species C the L4 clone.
Although each species A clone was unique, all A clones
differed from all B clones at the same 20 amino acid sites
and at these positions, species C had the species A
sequence at 14 positions and the species B sequence at 6
positions (Figure 7).

Okamoto and coworkers (Okamoto et al., 1992b)
previously determined the nearly complete genome consensus
sequence of strain HC-J4 in acute phase serum of the first
chimpanzee passage (HC-J4/83) as well as in chronic phase
serum collected 8.2 years later (HC-J4/91). In addition,
they determined the sequence of amino acids 379 to 413
(including HVR1) and amino acids 468 to 486 (including
HVR2) of multiple individual clones (Okamoto et al.,
1992b).

It was found by the present inventors that the
sequences of individual genomes in the plasma pool
collected from a chimpanzee inoculated with HC-J4/91 were
all more closely related to HC-J4/91 than to HC-J4/83
(Figures 8, 9) and contained HVR amino acid sequences
closely related to three of the four viral species
previously found in HC-J4/91 (Figure 10).

Thus, the data presented herein demonstrate the
occurrence of the simultaneous transmission of multiple
species to a single chimpanzee and clearly illustrate the
difficulties in accurately determining the evolution of
HCV over time since multiple species with significant
changes throughout the HCV genome can be present from the
onset of the infection. Accordingly, infection of
chimpanzees with monoclonal viruses derived from the
infectious clones described herein will make it possible
to perform more detailed studies of the evolution of HCV
in vivo and its importance for viral persistence and
pathogenesis.

EXAMPLE 6 

Determination Of The Consensus
Sequence Of HC-J4 In The Plasma Pool

The consensus sequence of nucleotides 156-9371
of HC-J4 was determined by two approaches. In one
approach, the consensus sequence was deduced from 9 clones
of the long RT-PCR product. In the other approach the
long RT-PCR product was reamplified by PCR as overlapping
fragments which were sequenced directly. The two
"consensus" sequences differed at 31 (0.34%) of 9216
nucleotide positions and at 11 (0.37%) of 3010 deduced
amino acid positions (Figure 7). At all of these
positions a major quasispecies of strain HC-J4 was found
in the plasma pool. At 9 additional amino acid positions
the cloned sequences displayed heterogeneity and the
direct sequence was ambiguous (Figure 7). Finally, it
should be noted that there were multiple amino acid
positions at which the consensus sequence obtained by
direct sequencing was identical to that obtained by
cloning and sequencing even though a major quasispecies
was detected (Figure 7).
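
The figures quoted for the comparison of the two "consensus" sequences follow from simple arithmetic on the coordinates and counts given in the text, as this short worked check shows. The helper names are illustrative only.

```python
def span_length(first_nt, last_nt):
    """Number of positions in an inclusive coordinate range."""
    return last_nt - first_nt + 1


def percent_differing(n_differences, n_positions):
    """Differences as a percentage of positions, rounded to two decimals."""
    return round(100 * n_differences / n_positions, 2)


compared_nts = span_length(156, 9371)          # region covered by both approaches
print(compared_nts)                            # nucleotide positions compared
print(percent_differing(31, compared_nts))     # nucleotide differences, in percent
print(percent_differing(11, 3010))             # amino acid differences, in percent
```

The inclusive range 156-9371 gives the 9216 nucleotide positions cited, and the two ratios round to the 0.34% and 0.37% reported.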

For positions at which the two "consensus"
sequences of HC-J4 differed, both amino acids were
included in a composite consensus sequence (Figure 7).
However, even with this allowance, none of the 9 L clones
analyzed (aa 1 - 2864) had the composite consensus
sequence: two clones did not encode the complete
polypeptide and the remaining 7 clones differed from the
consensus sequence by 3 - 13 amino acids (Figure 7).

EXAMPLE 7 

Construction Of Chimeric Full-Length cDNA
Clones Containing The Entire ORF Of HC-J4

The cassette vector used to clone strain H77 was 
used to construct an infectious cDNA clone containing the 
ORF of a second subtype. 

In brief, three full-length cDNA clones were
constructed by cloning different L fragments into the
PinAI/BglII site of pCV-J4S9, the cassette vector for
genotype 1a (Figure 6), which also contained an S fragment
encoding the consensus amino acid sequence of HC-J4.
Therefore, although the ORF was from strain HC-J4, most of
the 5' and 3' terminal sequences originated from strain
H77. As a result, the 5' and 3' UTR were chimeras of
genotypes 1a and 1b (Figure 11).

The first 155 nucleotides of the 5' UTR were
from strain H77 (genotype 1a), and differed from the
authentic sequence of HC-J4 (genotype 1b) at nucleotides
11, 12, 13, 34 and 35. In two clones (pCV-J4L2S,
pCV-J4L6S) the rest of the 5' UTR had the consensus sequence
of HC-J4, whereas the third clone (pCV-J4L4S) had a single
nucleotide insertion at position 207. In all 3 clones the
first 27 nucleotides of the 3' variable region of the 3'
UTR were identical with the consensus sequence of HC-J4.
The remaining 15 nucleotides of the variable region, the
poly U-UC region and the 3' conserved region of the 3' UTR
had the same sequence as an infectious clone of strain H77
(Figure 11).

None of the three full-length clones of HC-J4
had the ORF composite consensus sequence (Figures 7, 12).
The pCV-J4L6S clone had only three amino acid changes: Q
for R at position 231 (E1), V for A at position 937 (NS2)
and T for S at position 1215 (NS3). The pCV-J4L4S clone
had 7 amino acid changes, including a change at position
450 (E2) that eliminated a highly conserved N-linked
glycosylation site (Okamoto et al., 1992a). Finally, the
pCV-J4L2S clone had 9 amino acid changes compared with the
consensus sequence of HC-J4. A change at position 304
(E1) mutated a highly conserved cysteine residue (Bukh et
al., 1993; Okamoto et al., 1992a).

EXAMPLE 8

Transfection Of A Chimpanzee By In
Vitro Transcripts Of A Chimeric cDNA

The infectivity of the three chimeric HCV clones
was determined by ultrasound-guided percutaneous
intrahepatic injection into the liver of a chimpanzee of
the same amount of cDNA and transcription mixture for each
of the clones (Figure 5). This procedure is a less
invasive procedure than the laparotomy procedure utilized
by Kolykhalov et al. (1997) and Yanagi et al. (1997) and
should facilitate in vivo studies of cDNA clones of HCV in
chimpanzees since percutaneous procedures, unlike
laparotomy, can be performed repeatedly.

As shown in Figure 13, the chimpanzee became
infected with HCV as measured by increasing titers of 10^2
GE/ml at week 1 p.i., 10^3 GE/ml at week 2 p.i. and 10^4 -
10^5 GE/ml during weeks 3 to 10 p.i.

The viremic pattern found in the early phase of
the infection was similar to that observed for the
recombinant H77 virus in chimpanzees (Bukh et al.,
unpublished data; Kolykhalov et al., 1997; Yanagi et al.,
1997). The chimpanzee transfected in the present study
was chronically infected with hepatitis G virus
(HGV/GBV-C) (Bukh et al., 1998) and had a titer of 10^6 GE/ml at the
time of HCV transfection. Although HGV/GBV-C was
originally believed to be a hepatitis virus, it does not
cause hepatitis in chimpanzees (Bukh et al., 1998) and may
not replicate in the liver (Laskus et al., 1997). The
present study demonstrated that an ongoing infection of
HGV/GBV-C did not prevent acute HCV infection in the
chimpanzee model.

However, to identify which of the three full-length 
HC-J4 clones was infectious, the NS3 region (nts. 
3659-4110) of HCV genomes amplified by RT-PCR from serum 
samples taken from the infected chimpanzee during weeks 2 
and 4 post-infection (p.i.) was cloned and sequenced. As 
the PCR primers were a complete match with each of the 
original three clones, this assay should not have 
preferentially amplified one virus over another. Sequence 
analysis of 26 and 24 clones obtained at weeks 2 and 4 
p.i., respectively, demonstrated that all originated from 
the transcripts of pCV-J4L6S. 
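
The assignment of each sequenced RT-PCR clone to one of the three parental plasmids can be illustrated with a minimal sketch. It assumes the reads are already aligned to the same region as the parental sequences and simply picks the parent with the fewest mismatches. The 12-nt strings below are invented placeholders, not the actual NS3 sequences, and the function names are hypothetical.

```python
from collections import Counter


def mismatches(a: str, b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def assign_parent(read: str, parents: dict) -> str:
    """Return the name of the parental clone with the fewest mismatches."""
    return min(parents, key=lambda name: mismatches(read, parents[name]))


# Toy stand-ins for the NS3 region of the three full-length clones
# (illustrative only; the real region spans nts. 3659-4110).
parents = {
    "pCV-J4L2S": "ACGTACGTACGA",
    "pCV-J4L4S": "ACGTTCGTACGT",
    "pCV-J4L6S": "ACGAACGTACGT",
}

# Toy reads; the third carries one substitution relative to pCV-J4L6S.
reads = ["ACGAACGTACGT", "ACGAACGTACGT", "ACGAACGTACCT"]

tally = Counter(assign_parent(r, parents) for r in reads)
print(tally)
```

In this toy run every read is closest to pCV-J4L6S, which mirrors the kind of tally reported for the 26 and 24 clones above.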

Moreover, the consensus sequence of PCR products 
of the nearly complete genome (nts. 11-9441), amplified 
from serum obtained during week 2 p.i., was identical to 
the sequence of pCV-J4L6S and there was no evidence of 
quasispecies. Thus, RNA transcripts of pCV-J4L6S, but not 
of pCV-J4L2S or pCV-J4L4S, were infectious in vivo. The 
data in Figure 13 are therefore the product of the 
transfection of RNA transcripts of pCV-J4L6S. 

In addition, the chimeric sequences of genotypes 
1a and 1b in the UTRs were maintained in the infected 
chimpanzee. The consensus sequence of nucleotides 11-341 
of the 5' UTR and the variable region of the 3' UTR, 
amplified from serum obtained during weeks 2 and 4 p.i., 
had the expected chimeric sequence of genotypes 1a and 1b 
(Fig. 11). Also, three of four clones of the 3' UTR 
obtained at week 2 p.i. had the chimeric sequence of the 
variable region, whereas a single substitution was noted 
in the fourth clone. However, in all four clones the poly 
U region was longer (by 2-12 nts) than expected. Also, extra 
C and G residues were observed in this region. For the 
most part, the number of C residues in the poly UC region 
was maintained in all clones although the spacing varied. 
As shown previously, variations in the number of U 
residues can reflect artifacts introduced during PCR 
amplification (Yanagi et al., 1997). The sequence of the 
first 19 nucleotides of the conserved region was 
maintained in all four clones. Thus, with the exception 
of the poly U-UC region, the genomic sequences recovered 
from the infected chimpanzee were exactly those of the 
chimeric infectious clone pCV-J4L6S. 

The results presented in Figure 13 therefore 
demonstrate that HCV polypeptide sequences other than the 
consensus sequence can be infectious and that a chimeric 
genome containing portions of the H77 termini could 
produce an infectious virus. In addition, these results 
showed for the first time that it is possible to make 
infectious viruses containing 5' and 3' terminal sequences 
specific for two different subtypes of the same major 
genotype of HCV. 

EXAMPLE 9 

Construction Of A Chimeric 
1a/1b Infectious Clone 

A chimeric 1a/1b infectious clone in which the 
structural region of the genotype 1b infectious clone is 
inserted into the 1a clone of Yanagi et al. (1997) is 
constructed by following the protocol shown in Figure 15. 
The resultant chimera contains nucleotides 156-2763 of the 
1b clone described herein inserted into the 1a clone of 
Figures 4A-4F. The sequences of the primers shown in 
Figure 15 which are used in constructing this chimeric 
clone, designated pH77CV-J4, are presented below. 

1. H2751S (Cla I/Nde I) 

CGT CAT CGA TCC TCA GCG GGC ATA TGC ACT GGA CAC GGA 

2. H2870R 

CAT GCA CCA GCT GAT ATA GCG CTT GTA ATA TG 

3. H7851S 

TCC GTA GAG GAA GCT TGC AGC CTG ACG CCC 

4. H9173R (P-M) 

GTA CTT GCC ACA TAT AGC AGC CCT GCC TCC TCT G 

5. H9140S (P-M) 

CAG AGG AGG CAG GGC TGC TAT ATG TGG CAA GTA C 

6. H9417R 

CGT CTC TAG ACA GGA AAT GGC TTA AGA GGC CGG AGT GTT TAC C 

7. J4-2271S 

TGC AAT TGG ACT CGA GGA GAG CGC TGT AAC TTG GAG 

8. J4-2776R (Nde I) 

CGG TCC AAG GCA TAT GCT CGT GGT GGT AAC GCC AG 
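
Two properties of the primer list above can be checked mechanically. The sketch below does so under one assumption: that the enzyme names in parentheses refer to the standard recognition sequences of Cla I (ATCGAT) and Nde I (CATATG). The variable and function names are hypothetical; the primer sequences are copied from the list.

```python
# Primer sequences copied from the list above; spaces are removed by clean().
PRIMER_1 = "CGT CAT CGA TCC TCA GCG GGC ATA TGC ACT GGA CAC GGA"  # primer 1 (Cla I/Nde I)
PRIMER_4 = "GTA CTT GCC ACA TAT AGC AGC CCT GCC TCC TCT G"        # primer 4 (P-M)
PRIMER_5 = "CAG AGG AGG CAG GGC TGC TAT ATG TGG CAA GTA C"        # primer 5 (P-M)
PRIMER_8 = "CGG TCC AAG GCA TAT GCT CGT GGT GGT AAC GCC AG"       # primer 8 (Nde I)

# Standard recognition sequences; both are palindromic, so searching
# one strand of the primer is sufficient.
SITES = {"Cla I": "ATCGAT", "Nde I": "CATATG"}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(primer: str) -> str:
    """Strip the triplet spacing used in the printed primer list."""
    return primer.replace(" ", "").upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unspaced DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def sites_in(primer: str) -> list:
    """List the named restriction sites present in a primer."""
    seq = clean(primer)
    return [name for name, site in SITES.items() if site in seq]


print(sites_in(PRIMER_1))                                  # ['Cla I', 'Nde I']
print(sites_in(PRIMER_8))                                  # ['Nde I']
print(reverse_complement(clean(PRIMER_5)) == clean(PRIMER_4))  # True
print(sites_in(PRIMER_4), sites_in(PRIMER_5))              # [] []
```

Primers 4 and 5 are exact reverse complements of each other, and neither contains an Nde I or Cla I site, whereas primers 1 and 8 carry the sites named in their labels.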

Transcripts of the chimeric 1a/1b clone (whose 
sequence is shown in Figures 16A-16F) are then produced 
and transfected into chimpanzees by the methods described 
in the Materials and Methods section herein and the 
transfected animals are then subjected to biochemical 
(ALT levels), histopathological and PCR analyses to 
determine the infectivity of the chimeric clone. 
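
The coordinate arithmetic behind such a chimera can be sketched as follows. This is a simplified illustration that assumes the donor and backbone genomes are co-linear over the exchanged region, so the same 1-based, inclusive coordinates apply to both; the actual construction proceeds through the PCR and restriction steps of Figure 15. The function name and toy sequences are hypothetical.

```python
def replace_region(backbone: str, donor: str, start: int, end: int) -> str:
    """Replace 1-based inclusive positions start..end of `backbone`
    with the same positions taken from `donor`."""
    if not (1 <= start <= end <= min(len(backbone), len(donor))):
        raise ValueError("coordinates fall outside one of the sequences")
    return backbone[:start - 1] + donor[start - 1:end] + backbone[end:]


# Toy 20-nt genomes (illustrative only): 'A' marks the backbone clone,
# 'C' marks the donor clone.
backbone = "A" * 20
donor = "C" * 20

chimera = replace_region(backbone, donor, 5, 10)
print(chimera)

# Size of the segment named in the text (nucleotides 156-2763, inclusive).
print(2763 - 156 + 1)  # 2608
```

With inclusive numbering, the exchanged segment described above is 2608 nucleotides long.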

EXAMPLE 10 

Construction Of 3' Deletion Mutants 
Of The 1a Infectious Clone pCV-H77C 

Seven constructs having various deletions in the 
3' untranslated region (UTR) of the 1a infectious clone 
pCV-H77C were constructed as described in Figures 17A-17B. 
The 3' untranslated sequence remaining in each of the 
seven constructs following their respective deletions is 
shown in Figures 17A-17B. 

Construct pCV-H77C (-98X), containing a deletion 
of the 3'-most 98 nucleotides in the 3'-UTR, was 
transcribed in vitro according to the methods described 
herein and 1 ml of the diluted transcription mixture was 
percutaneously transfected into the liver of a chimpanzee 
with the aid of ultrasound. After three weeks, the 
transfection was repeated. The chimpanzee was observed to 
be negative for hepatitis C virus replication as measured 
by RT-PCR assay for 5 weeks after transfection. These 
results demonstrate that the deleted 98 nucleotide 3'-UTR 
sequence was critical for production of infectious HCV and 
appear to contradict the reports of Dash et al. (1996) and 
Yoo et al. (1995), who reported that RNA transcripts from 
cDNA clones of HCV-1 and HCV-N lacking the terminal 98 
conserved nucleotides at the very 3' end of the 3'-UTR 
resulted in viral replication after transfection into 
human hepatoma cell lines. 
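
A 3'-terminal deletion of this kind amounts to truncating the cDNA sequence from its 3' end. The sketch below illustrates that operation on a synthetic placeholder sequence; it assumes a full-length clone of 9599 nucleotides (the last coordinate printed in Figure 4F) and does not model the cloning steps of Figures 17A-17B. The function name is hypothetical.

```python
def delete_3prime(seq: str, n: int) -> str:
    """Return `seq` with its 3'-most `n` nucleotides removed."""
    if n < 0 or n > len(seq):
        raise ValueError("deletion length must be between 0 and len(seq)")
    return seq[:len(seq) - n]


# Synthetic stand-in for a 9599-nt genome (illustrative only).
genome = ("ACGT" * 2400)[:9599]

minus_98x = delete_3prime(genome, 98)
print(len(genome), len(minus_98x))  # 9599 9501
```

The truncated sequence is identical to the parent up to nucleotide 9501 and lacks only the terminal 98 positions.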

Transcripts of the (-42X) mutant (Figure 17C) 
were also produced and transfected into a chimpanzee, and 
transcripts of the other five deletion mutants (shown in 
Figures 17D-17G) are to be produced and transfected into 
chimpanzees by the methods described herein. All 
transfected animals are then to be assayed for viral 
replication via RT-PCR. 

Discussion 

In two recent reports on transfection of 
chimpanzees, only those clones engineered to have the 
independently determined and slightly different consensus 
amino acid sequence of the polypeptide of strain H77 were 
infectious (Kolykhalov et al., 1997; Yanagi et al., 1997). 
Although the two infectious clones differed at four amino 
acid positions, these differences were represented in a 
major component of the quasispecies of the cloning source. 
In the present study, a single consensus sequence of 
strain HC-J4 could not be defined because the consensus 
sequences obtained by two different approaches (direct 
sequencing and sequencing of cloned products) differed at 
20 amino acid positions, even though the same genomic PCR 
product was analyzed. The infectious clone differed at 
two positions from the composite amino acid consensus 
sequence, from the sequence of the 8 additional HC-J4 
clones analyzed in this study and from published sequences 
of earlier passage samples. An additional amino acid 
differed from the composite consensus sequence but was 
found in two other HC-J4 clones analyzed in this study. 
The two non-infectious full-length clones of HC-J4 
differed from the composite consensus sequence at only 7 
and 9 amino acid positions. However, since these clones 
had the same termini as the infectious clone (except for a 
single nucleotide insertion in the 5' UTR of pCV-J4L4S), 
one or more of these amino acid changes in each clone was 
apparently deleterious for the virus. 
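
The comparison of a clone against a consensus derived from several cloned products can be sketched as below. This is a minimal illustration that assumes aligned, equal-length amino acid sequences and a simple majority rule at each position; the ten-residue strings are invented and are not HC-J4 sequences, and the function names are hypothetical.

```python
from collections import Counter


def consensus(seqs: list) -> str:
    """Majority-rule consensus of aligned, equal-length sequences."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def differences(reference: str, query: str) -> list:
    """Return (1-based position, reference residue, query residue) for each mismatch."""
    return [(i, r, q)
            for i, (r, q) in enumerate(zip(reference, query), start=1)
            if r != q]


# Toy cloned-product sequences (illustrative only).
clones = [
    "MKTAYIAKQR",
    "MKTAYIAKQR",
    "MKTAYLAKQR",
    "MKSAYIAKQR",
    "MKTAYIAKQR",
]

cons = consensus(clones)
candidate = "MKTAYIAKHR"  # hypothetical full-length clone

print(cons)
print(differences(cons, candidate))
```

Counting the entries returned by `differences` gives the number of positions at which a given clone departs from the consensus, which is the quantity compared between the infectious and non-infectious clones above.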
It was also found in the present study that HC-J4, 
like other strains of genotype 1b (Kolykhalov et al., 
1996; Tanaka et al., 1996; Yamada et al., 1996), had a 
poly U-UC region followed by a terminal conserved element. 
The poly U-UC region appears to vary considerably, so it 
was not clear whether changes in this region would have a 
significant effect on virus replication. On the other 
hand, the 3' 98 nucleotides of the HCV genome were 
previously shown to be identical among other strains of 
genotypes 1a and 1b (Kolykhalov et al., 1996; Tanaka et 
al., 1996). Thus, use of the cassette vector would not 
alter this region except for addition of 3 nucleotides 
found in strain H77 between the poly UC region and the 3' 
98 conserved nucleotides. 

In conclusion, an infectious clone representing 
a genotype 1b strain of HCV has been constructed. Thus, 
it has been demonstrated that it was possible to obtain an 
infectious clone of a second strain of HCV. In addition, 
it has been shown that a consensus amino acid sequence was 
not absolutely required for infectivity and that chimeras 
between the UTRs of two different genotypes could be 
viable. 

WHAT IS CLAIMED IS: 

1. A purified and isolated nucleic acid 
molecule which encodes human hepatitis C virus, said 
molecule capable of expressing said virus when transfected 
into cells. 

2. The nucleic acid molecule of claim 1, 
wherein said molecule encodes the amino acid sequence 
shown in Figures 14G-14H. 

3. The nucleic acid molecule of claim 2, 
wherein said molecule comprises the nucleic acid sequence 
shown in Figures 14A-14F. 

4. The nucleic acid molecule of 
claim 1, wherein said molecule encodes the amino acid 
sequence shown in Figures 4G-4H. 

5. The nucleic acid molecule of claim 4, 
wherein said molecule comprises the nucleic acid sequence 
shown in Figures 4A-4F. 

6. The nucleic acid molecule of claim 1, 
wherein a fragment of said molecule which encodes the 
structural region of hepatitis C virus has been replaced 
by the structural region from the genome of another 
hepatitis C virus strain. 

7. The nucleic acid molecule of claim 6, 
wherein said molecule encodes the amino acid sequence 
shown in Figures 16G-16H. 

8. The nucleic acid molecule of claim 7, 
wherein said molecule comprises the nucleic acid sequence 
shown in Figures 16A-16F. 

9. The nucleic acid molecule of claim 1, 
wherein a fragment of the nucleic acid molecule which 
encodes at least one HCV protein has been replaced by a 
fragment of the genome of another hepatitis C virus strain 
which encodes the corresponding protein. 

10. The nucleic acid molecule of claim 9, 
wherein the protein is selected from the group consisting 
of E1, E2 or NS4 proteins. 

11. The nucleic acid molecule of claim 1, 
wherein a fragment of the molecule encoding all or part of 
an HCV protein has been deleted. 

12. The nucleic acid molecule of claim 11, 
wherein the HCV protein is selected from the group 
consisting of P7, NS4B or NS5A proteins. 

13. A DNA construct comprising a nucleic acid 
molecule according to claims 1, 3, 5 or 8. 

14. An RNA transcript of the DNA construct of 
claim 13. 

15. A cell transfected with the DNA construct 
of claim 13. 

16. A cell transfected with RNA transcript of 
claim 14. 

17. A hepatitis C virus polypeptide produced by 
the cell of claim 15. 

18. A hepatitis C virus polypeptide produced by 
the cell of claim 16. 

19. A hepatitis C virus produced by the cell of 
claim 13. 

20. A hepatitis C virus produced by the cell of 
claim 14. 

21. A hepatitis C virus whose genome comprises 
a nucleic acid molecule according to claims 1, 3, 5, 6, 8, 
or 9. 

22. A method for producing a hepatitis C virus 
comprising transfecting a host cell with the RNA 
transcript of claim 14. 

23. A polypeptide encoded by a nucleic acid 
sequence according to claims 1, 2, 4 or 7 or a fragment 
thereof. 

24. The polypeptide of claim 23, wherein said 
polypeptide is selected from the group consisting of NS3 
protease, E1 protein, E2 protein or NS4 protein. 

25. A method for assaying candidate antiviral 
agents for activity against HCV, comprising: 

a) exposing a cell containing the hepatitis C 
virus of claim 21 to the candidate antiviral agent; and 

b) measuring the presence or absence of 
hepatitis C virus replication in the cell of step (a). 

26. The method of claim 25, wherein said 
replication in step (b) is measured by at least one of the 
following: negative strand RT-PCR, quantitative RT-PCR, 
Western blot, immunofluorescence, or infectivity in a 
susceptible animal. 

27. A method for assaying candidate antiviral 
agents for activity against HCV, comprising: 

a) exposing an HCV protease encoded by a nucleic acid 
sequence according to claims 1, 2, 4, or 7, or a fragment 
thereof, to the candidate antiviral agent in the 
presence of a protease substrate; and 

b) measuring the protease activity of said protease. 

28. The method of claim 27, wherein said HCV 
protease is selected from the group consisting of an NS3 
domain protease, an NS3-NS4A fusion polypeptide, or an 
NS2-NS3 protease. 

29. An antiviral agent identified as having 
antiviral activity for HCV by the method of claim 25. 

30. An antiviral agent identified as having 
antiviral activity for HCV by the method of claim 27. 

31. Antibody to the polypeptide of claim 23. 

32. Antibody to the hepatitis C virus of claim 
21. 

33. A method for determining the susceptibility 
of cells in vitro to support HCV infection, comprising the 
steps of: 

a. growing animal cells in vitro; 

b. transfecting into said cells the nucleic acid of claim 1; and 

c. determining if said cells show indicia of HCV replication. 

34. The method according to claim 33, wherein 
said cells are human cells. 

35. A cassette vector for cloning viral 
genomes, comprising, inserted therein, the nucleic acid 
sequence according to claim 2, said vector reading in the 
correct phase for the expression of said inserted sequence 
and having an active promoter sequence upstream thereof. 

36. The cassette vector of claim 35, wherein 
the cassette vector is produced from plasmid pCV. 

37. The cassette vector of claim 35, wherein 
the vector also contains one or more expressible marker 
genes. 

38. The cassette vector of claim 35, wherein 
the inserted DNA sequence contains at least one ORF of the 
HCV genome from any strain. 

39. The cassette vector of claim 35, wherein 
the promoter is a bacterial promoter. 

40. A composition comprising a polypeptide of 
claim 23 suspended in a suitable amount of a 
pharmaceutically acceptable diluent or excipient. 

41. A method for treating hepatitis C viral 
infection comprising the administration to an animal in 
need thereof of a clinically effective amount of the 
composition of claim 40. 

42. A composition comprising a nucleic acid 
molecule of claim 1 suspended in a suitable amount of a 
pharmaceutically acceptable diluent or excipient. 

43. A method for treating hepatitis C viral 
infection comprising the administration to an animal in 
need thereof of a clinically effective amount of the 
composition of claim 42. 

Four clones used for constructing pCV-H77C (infectious clone) 

[Sequence listing, FIG. 4A-4F: nucleotide sequence of the infectious clone pCV-H77C, nucleotides 1-9599] 

[Sequence listing, FIG. 4G-4H: amino acid sequence encoded by pCV-H77C, residues 1-3011] 
FIG. 12 

WO 99/04008    PCT/US98/14688 

FIG. 13 

HC-J4 

10 20 30 40 50 

1234567890 1234567890 1234567890 1234567890 1234567890 

GQCAGQOOQC TCATO3333C GACACTOCAC CA3GAATCAC TOOXTCIGA 50 

GGAACTACTG TCTICACGCA GAAAQ03TCT AQOCA2QQGG TEftGTftTCAG 100 

TCICGIQCAG OCTOCAQ3AC C0QG0CIC0C QQGAGAGQCA TAGIQGTCIG 150 

0GGAAO333T GAGTACAQ33 GAA3T30CAG GAOGADOQQG TOLTI'ILTIU 200 

GATCAACCQG CICAA3QQCT GGAGATTIGG GO3TO000QC QQGAGACIGC 250 

TAG03GAGTA GIGITQQC3IC GQGAAAQGOO TIGTQ3IACT GCCIGAIAQG 300 

GIQCTI003A GTOOXOQQG AQC3ICIOGIA GA003T0CAC CAIGAGCA03 350 

AATOCEftAAC CICAAAGAAA AADCAAAQ3T AACAQCAAOC GQOQQQCACA 400 

QGA03ICAAG TTQOOGQQOG GIQGICAGAT 03TIQ3TGGA GITEAOCTST 450 

TOQQGOGCAG G330GQCAGG TIQGSTGTQC QD30GACTAG GAAGGCTIOC 500 

GAGQ3GTCQC AADCIGGIQG AAQ303ACAA QCnMOQCAA AQGCIO3003 550 

ACCQGAQ3QC AGGGCCIGQG CICftGQOCQG GTAOXTIQG COOCICIMG 600 

GCAATGAGQ3 CCT33QGIQG QCAQGA3G9C TOCIGTCAOO 0390030100 650 

G3GCCTAGIT GGG3CCCCAC GGA0030033 G3TAQGTOGC GTAACTIOQG 700 

TAAQGICATC GA.TAOQCTEA. CAT33QQCIT O300SA3CTO ATQQGGTACA. 750 

TI0O30IO3T OQQCX3QQQOC CTAQQQQQ03 CIOOCAQQGC CTIGQCACAC 800 

GGKJID3333 TICIQ3AQG& O3G03IGAAC TA.T3CAACAG GGAACITCQC 850 

O3GTIQ0TCT TTCICTATOT T0CICIT3QC TCIGC'IGIOO TGTITCftOCA. 900 

TCCC&QCTIC OGCTEAT3AA GIQ0QCAAO3 T3T03333AT ATACCAIGTC 950 

AQGAAOGACT GCIOOAACTC AAQCATTCIG TA3GA0QCAG O3GA03TGAT 1000 

CAT3CAEACT Q0O3Q3T3Q3 TOOOCIGIGT TOAGGAGGGT AACAGCIOCC 1050 

GTIOCTGGGT AQG3CICACT 03303003 03300AQ3AA TO0CAQO3IC 1100 

COCACTA03A CAATAOGAQG CCA03TOGAC TIGCI03TIG GGAQ33CTGC 1150 

TTICICCTOC QCIAT3TAOG T3QQ33ATCT CIQQQ3ATCT ATITTOCTCG 1200 

TCIOOCAGCT GTTCAQCITC T03CCTOGGC G3CAIGAGAC AGIGCAQGAC 1250 

TOCAACTGCT CAATCTATOC CGQOCATCLA TCAQ3TCAQC GCAJG3CTIG 1300 

GGATATGAT3 ATGAACTQ3T CAOOIACAAC AGOQCEftGIG GIGT09CAGT 1350 

TGCTC33GAT CQCACAAQCT GTCOIQ3ACA T3STG30Q33 G300CACTQ3 1400 

GGAGTOCTGG O3QQ0CTIQC CTACIATIOO ATQGTAG33A ACTG33CTAA 1450 

GGTIOIGATT GTGGQGOTAC TCITIQ30GG 03IT3AGG33 GAGAQOCACA 1500 

CGAQQGQGAG 0310300300 CACAOCAQCT C0QQ3TICAC GTOOJITI'IC 1550 

TCATCIGGGG Q3TCICAGAA AATOOAGOTT GIGAATAOCA A03QCAGCIO 1600 

GCACATCAAC AQGACTQOOC TAAATTGCAA T3ACI00CIC CAAACTO33T 1650 

TCrnOOOQC GCIGTITEAC G3ACACAAGT TCAACTOGTC O33GTO000G 1700 

GAGOGCATGG OCAGOTQOOG CQOCATIGAC T3GTIQQ00C AGG33TGQGG 1750 

OOCCATOACC TATACEAAGC CTAACAQCTC GSATCAGAQG OCTEATIOOT 1800 

GGCATTAQQC QCCIO3AC03 TGTO3IGTQG TAQ0O333TC GCAQGIGIGT 1850 

GGTOCAGIGT AT1U1T1C AC OOCAAQOOOT GTIGI03TQS GGAOOAOOGA 1900 
TCGITOOQSr GTCXETAOGT AT&GCIG3GG QGAGAMGAG ACftGfi03I Gft. 1950 

IGL'lUCICftA. CAACAOQ0GT <X0CX3O^G GCZAACIGGTT CXXXIGTACA 2000 

TGGA.TGAAIA. GIACIQ33IT CACIAAGAOG TO033ftQ3IC UULUALUJaA 2050 

CATOQ3333G GTO33IAAQC QCAOCITGAT 010X0303 GACTXTICC 2100 

GGAAQCAOOC Q3A3XTACT TftCACAAAKT CTQ 33Q3 3G GXCIG3TIG 2150 

ACAOCiaaSP G03&GTAGA CT&Q32AIAC AOXTITOQC ACTROXCIG 2200 

CACICICftBT TTTPXK ICr TTft&QSTIAG GftllUlimUIG 0033303103 2250 

AGCZOQ3CT CAKD3XGCA T3C3^KTIQSA CIOGftGSRGA GUGCT31MC 2300 

TIQ3AG3ACA G33ATAQ3IC AGAACIGAGC OO3CIG0IQC TSICEOAC 2350 

AGACTQ3CAG ATACT3XCT GIGCTTICAC CACOCTAOOG Q CTITKIQC A. 2400 

CiGG TTIG ftT (XATCICCAT CAGAACATO3 TQGACGIQCA. ATAiXTGTfiC 2450 

QGTCT&GGGT CftQO STTIGT CTOCTTIQCA. ATCAAASG33 AG TACAIOCT 2500 

GTIGCTnTC C T1CTO IQ3 CAGA03XCG 03IGIGIGQC 1QCITGIGGA 2550 

TCAIGCIGCT GATftQCOCAG GCIGAG3XG CCTEAGAGAA CIT33IG3IC 2600 

CICAAIGGGS 0GICCGIO3C 0Q3AGO3GAT G3TATICICT CCTITCTIGT 2650 

Gn d TCIGC QCCGOCIGOT ACATTAA033 CAGQCIQ3CT OCT33330QS 2700 

CGTEMGCITT TIAICGOSTA TGCXOSCIGC TXTOCIOCT ACTOXCTIA 2750 

CCAGCACGAG CTIACGXTT QGADGGOGAG A1Q3CIQCAT CCT3Q3GGGG 2800 

IGJG GTIdT GTAQGTCIOG TATICTIGAC CITGTCAOCA TAC3ACAAAG 2850 

1GITICICAC TAQQCICATA T03TG3TTAC AATACTTEAT CACCAGAGQC 2900 

GAQGCQCACA TXXAAGT3IG GGIOXCCCC CTCAAGGTIC QGQ GftQXCG 2950 

CGATGXATC ATOCICCICA CGICT3CX3GT TCATCCAGAG TEAATITITG 3000 

ACATCACCAA ACIOTGCTC QCCATACTOG GXOXTCAT GGIGCIDCAG 3050 

GCIQXATAA CGLAGAGIGCC GTACTIQGTG CG33CTCAAG GGCICATIOG 3100 

IQCMGCAIG TIAGTGCGAA AAGTD3QQGG GGGTCATIAT GTUZAAATGG 3150 

TCITCAIGAA. GCIG03QGQ3 CIGACAGXA 03EAGGTTIA TAACCATCTT 3200 

XXXTACIQC G3GACK33X CTAOX333C CIAQSAGACC TIOC33GIQ3C 3250 

GGTAGAQXC GTOGTCTICT CGGOCAIGGA. GAXAAQGIC ATCACCIQQG 3300 

GA3CAGACAC CGCIGCGIGT 0333OTCA TCTIG33ICT ACraSICIOC 3350 

GCCCGAAGG3 GGAAGGAGAT ATTTTIGGGA CCGGCTGAEA. GIUTCGAAGG 3400 

GCAAQ33IGG GGACTXTTC GGOCCAICAC GGGCDOCC CAACAAAOGC 3450 

GOXCGlACr TOGTIQCATC ATCACIAGCC TCACAGG003 GGACA&GAAC 3500 

CAGGTCGAAG GGGAGGTICA. AGTOXTICr ACQGCAACAC AATCITIDCT 3550 

G3GGAGCIGC ATCAAGGGOG TGIGCIGGAC T3ICTACCAT GGOOCIQQCT 3600 

CGAA3AOCCT AGCCGGTCCA AAAGGIDCAA TCADCCAAAT GTACAQCAAT 3650 

GTAGACCT33 ACCTOGTCG3 CTCQCAGGCG QXCCGQQ3G 03GQCTXAT 3700 

GACACCATOC AGCIGTXCA GCTOGGACCT TEACTIGGIC ACGAGACATC 3750 

CIGATGICAT TCGGGTGCX CGGOGLAQGCG ACAGCAGGQG AAGTUTACTC 3800 
TO00QCAG3C OCXSICIOCm CCIGAAAG3C TOCI033SIG GIC CA TIGC T 3850 

TIQCXOTCE G33CA03KG TQ3305ICrr Q3QGQC7IGCT GiUlUCAOCE 3900 

033333IOX Q\AG333GIG GALTlCATftC OUl'lGAGIC TBIGGAAACT 3950 

CTOO OSICIT CACAGACAAC TCAAO3003C 033CIGTAOC 4000 

Q2AGACATIC CAAGTQ3ZAC AICIGCA03C 1DCnOQ32 AGZG3CZAAGA 4050 

Q2AOCAAAGT GOOGGCIGOS TASQCAGCCC AAQGGIECAA. QSIQCTOSIC 4100 

cigaacoogt umriGGoac cACcnaoas Tnooaaasr ataigtgcaa 4i50 

GQCACA033T ATOGAOCTA ACATCAGAAC T3333EAAQ3 AOGATB£EA 4200 

G0G3033CIC CATC03IAC 10CAOCTAIG GCAAGTICCT TQ33GAQQGT 4250 

GG CIG n CIG GGGGOQQ M'A T3ACAICAIA ATATCIGAIG AG1ULIJACTC 4300 

AACIGACIOG ACC^IXAICT T33XAID33 C30G1011G GAOCAAG0Q3 4350 

AGAQ33CTGG AG0GO33CIC GT0GIGCIO3 0CAG03CIAC A0CTO03GGA 4400 

TOGGirAOOG TSXACAOOC CAATATO3AG GAAATAGGOC TGIOCAACAA 4450 

T3GAGAGATC OG L ' l ' lC T A TC GCAAAGQCAT CCOCATIGAG GQCATCAAGG 4500 

GG33GAG3CA TCICATITIC T30CATICCA AGAAGAAA1G TGAOGAGCIC 4550 

GCOQCAAAGC TGACAGQQCT OGGACIGAAC GCIGIAGCAT ATI?ODQ333 4600 

OCTIGAIGIG TCGGTCAIAC OGOCIATCGG AGAOGIOGTT GTOGIGGCAA. 4650 

CAGAQ3CTGT AATCA03QGT TICAC033CG ATTTTGACTC AGIGATCGAC 4700 

TOCAATACAT GTGTGACQCA GACAGTOGAC TIGAGCTIQG ATCQCADCTT 4750 

CAOCATIGAG ACGA0GA003 TGCCQCAAGA CGOQGTGIOG 03CIGGCAAC 4800 

QQOGAGGIAG AACIQQCAGG GGTAGGAGTG GCAIUEACAG GTTIGTGACT 4850 

GCAQ3AGAAC GGGGCIOQQS CAIGTID3AT TCITOGGTGC T3IGTGAGIG 4900 

CTA1GACQCG QG CIGIGCIT GGTAIGAGCT CAQQCOOGCT GAGAQCJI033 4950 

TTAGGTIGaS G3CTIA0CIA AATACACCAG Q3ITGG003T CIGOCAGGAC 5000 

CATCIQGAGT TCIGGGAGAG OGICTICACA Q30CICACCC ACATAGAT3C 5050 

QCACTTOCIG TCCCAGACIA AACAG3CAGG AGACAACITT OCTIADZEGG 5100 

TQGCAIA1CA AGCTACAGIG TQ0QGCAQ33 CICAAQCTDC AQCTCCATOG 5150 

TQQGACCAAA T3TGGAAGTG TCICAIAD33 CIGAAADCIA CACTGCACQG 5200 

GOCAACAOQC CTGCIGTAIA GQCEAGGAGC OGIDZAAAAT GAGGTCATCC 5250 

TCACACACCC CAIAACEAAA TACATCA1GG CAIGCATOIC GOCTGAOCIG 5300 

GAQS1CGICA CEAGCAOCTG QGIGCIG3TA GQQQGAGUX TIGCAGCITT 5350 

GQQCQCAIAC TOCCIGAGGA CAQ3CAGTGT GGTCATIGIG GGCAGGAICA 5400 

1CTIGIC0Q3 GAAQQCAGCT GICGTICOQG ACAGGGAAGT OCICTAOCAG 5450 

GAGTICGA3G AGA3GGAAGA GTGIGQCTCA CAACITOCTT ACATCGAQCA 5500 

03GAAIQCAG CTQQGCGAGC AATTCAAGCA AAAGG03CIC GGGTIGTIGC 5550 

AAACQGCCAC CAAGCAAGCG GAQGCTGCIG CTCCOGIGGT GGAGTCCAAG 5600 

TOGCGAGCCC TIGAGAOCTT CIG3XGAAG CACATGTG3A ATTICAICAG 5650 

OQGAATACAG TACCTAGCAG QCITATCCAC TCIGOCIG3A AAOQOQQOGA 5700 
GAIGOCATIT ACAQLTIL'IA TCACTAGOCC GZECACEACC 5750 

CAAAACACCC TOC'lUri'lAA CATCTIQQ33 GGA1G33IGG CTCTCCAACT 5800 

CGCIDCnrC AGCGCKCCT CAOJl'l'lULJi' QQQCX3CD3GC A3CGOCGGAG 5850 

CQGCIGTIGG CAGCAIAGGC CTIGGGAAQ3 TQCTCGIGGA CAICTTO30G 5900 

GXTATG33G CAG333IAX CX23CGCACTC GTO3QCTTIA AQ3ICA1GAG 5950 

OQQCGAGGIG CUCVLaCAQCG AQ3AOCTQGT OJOTACIC ai'lUmJATOC 6000 

' IL ' IL ' ICC I GG TGOJCT3GIC GTO3GGGTCG '1UIGGGCAGC AATACTUGGJi' 6050 

GQQZA0GIG3 QCD3333AGA GQ393CIGIG CAGTGGAIGA ALUAJL'lUAT 6100 

AQOGITCECT TCQ0333GIA AOZACGTCTC COEE03CAC 1MGIG0CIG 6150 

AGAG3GMGC T3CAQ2A03T GICACICAGA TOCJICICEAG OCTEAQCATC 6200 

JOCAACIGC TCAAG03XT OCADCAGIG3 ATEAATCAGG ACIGL'IL'IJAC 6250 

QQCATOCIOC G3CTOGTGGC TAAGGGATCT TIGGGATIGG Al&lGCAOGG 6300 

TSTIGACTGA CTTCAAGACC TGGCTCCAGT CCAAACICCT GGQ33GGTIA 6350 

O03QGAGTCC CTTTCCIGTC ATCCCAACGC OGGTACAAGG GAGTCIQQOG 6400 

GSGGGAQQQC ATCATCCAAA CCACCTGCCC ATGOGGAGCA CAGATCGGCG 6450 

GACATCTCAA AAACGGTTCC ATCAGGATCG TAGGGGCTAG AAQCTGCAQC 6500 

AACACGTGGC AGGGAAGGTT CCOCATCAAC GCAIACACCA CGGGACCTIG 6550 

CACACQCTCX: COGGOGOCCA ACIATTGCAG GGOGCIAIGG GGGGTGGCTG 6600 . 

CIGAGGAGTA CGIGGAGGTT AOGGGTGTGG QGGATCTCCA CTAGGTGACG 6650 

GGCATGAQCA CTGACAACGT AAAGIGCCCA TGCCAGGTTC OGGOCCQCGA 6700 

ATTCITCACG GAGGTGGATC GAGIGOQGTT GCACAGGTAC GCTCOGGOGT 6750 

GCAAACCTCT TCTAGGGGAG GACGTCACGT TCCAQGTCGG GCTCAAQCAA 6800 

TAL Tl ULa lL G GGTCGCAGCT CCCATGCGAG CGCGAACCGG ADGTAACAGT 6850 

GCTEACTICC ATCCTCACCG ATCCCTCCCA CATIACAGCA GAGAQQQCTA 6900 

AGCGTAQGCT GGCTAGAGGG TCTCCCCCCT CTTTAQQCAG CTCA1CAGCT 6950 

AGCCAGTTGT CTGCGCCTIC TITSAAGQOG ACATGCACTA CXXACCATCA 7000 

CTCCGCGGAC QCIGAQCTCA TCGAQGCCAA CCTCTTCTGG QQQCAGGAGA 7050 

TOGGQQGAAA (ZATCACTOGC GIQGAGTCAG AGAATAAGGT AGTAATICTC 7100 

GA L ' l L' r i' lL G AACGGCTICA CG03GAG333 GATCAGAGGG AGAIATCQGT 7150 

GGCGGCGGAG ATCCTOCGAA AATCGAQGAA GTIOOCCTCA GOGTIGCGCA 7200 

TATGGGCACG CGOQGACTAC AATCCTGCAC TGCTAGAGTC CTGGAAGGAC 7250 

CCGGACTACG TCCCTCCGGT GGTACACGGA TGCCCATIGC CAQCTACCAA 7300 

GQCTCCTCCA ATACCACCTC CACGGAGAAA GAGGACGGTT GICCIGACAG 7350 

AATCXZAAIUT GTCITCTGOC TIGGCGGAGC TC3CCACTAA GACCTTCQGT 7400 

AGCTCOGGAT CGTCGGOCGT TGATAGOGGC AQGGOGAOGG CCCTIOCTGA 7450 

CCTCQCCTCC GAQGACGGTG ACAAAGGATC CGAGGTIGAG TCGTACTCCT 7500 

CCATGCCCOC CCTTGAAGGG GAGOCGGGGG ACCGCGATCT CAQQGA03QG . 7550 

COGTGAGTGA GGAGGCTAGT GAGGA3GTOG TCTGCTGCTC 7600 
AA1U1UCTAT ACULUGACAG Q03QQCIGAT CAOGOGA1GC GCIQOQGAGG 7650 

AAAGIAAGCr GOCCATCAAC COJl'lLiftGCA. AL1U1T1UCT GOGICAOCftC 7700 

AACATOGICT AOGOCACAAC AlOOOGGAQC GCAAGOCIOC GGCAGAAGAA 7750 

Q3ICA0CTTT GACAGATICC AAUIUL'IGGA TGA3CATIAC GGGGAOGTAC 7800 

TCAAQGAGAT GAAGGOGAAG G03ICQOG TIAAQGCrEAA QLTlUJmLT 7850 

AIAGAGGAGG OCHOCAAGCT GACX3CXXXXA CATIUGGOIA AATOGAAATT 7900 

1QQCEKIQ3G GCAAAQGAQG T00GGAAOCT JOOCAQCAGG QLUJl'lAACE 7950 

AemaoGCic osigtgqgag GALTIUL'IGG AAGACACTGA AACACEAATr 8000 

GACAOCAOCA TCAIQQCAAA AAGTGAGGTT TICIGOGTOC AAQGAGAGAA 8050 

GQGAQGQOQC AAGXAGCIC QLUITAILUI' ATTOCCAGAC CIQGGAGTIC 8100 

GIGTATOCGA GAAGATOGOC: C3TEACGAD3 TQ3IUICCAC OJI'ILUIUAG 8150 

QQQCTGAIQG GCICCICAIA CQGATnCAA TACTCQOOCA AGCAGOQGGT 8200 

CGAGITCCIG GIGAAIACCT GGAAATCAAA GAAASQOOCT ATO33LT1LT 8250 

CATATGACAC GQQCIGTTIT GACICAAQ3G TCACIGAGAG TGACATICGT 8300 

GTIGAQGAGT CAATITAGCA ATGTIGTGAC TT3Q0GGQ0G AQGOCAGACA 8350 

Q3CCATAAG3 TCGCICACAG AGOGGCTITA CA20333GGT GXC'IGACEA 8400 

ACICAAAAGG GCAGAACIGC GCHTAT33ZC GGTDGQOQOGC AAGIGGCGIG 8450 

CIGAGGACTA GCTGQGGIAA TACCCICACA TGTTACTIGA AGXCACIQC 8500 

AGCCTGTOGA GZTGCAAAGC TOCAGGACTC CAQGAIGCTC GIGAAQQGAG 8550 

AOGACCT1G T aJTTATCIGT GAAAQQQ03G GAADOCAGGA QGATO33303 8600 

QCCCTAQGAG (XTICAQGGA GGCEA1GACT AGGIATIOQG CQGOD3CQGG 8650 

GGATCCQCCC CAAOCAGAAT A03ACCT33A GCIGATAACA TCATGTKTT 8700 

CCAAJIGIGIC AGICGOGCAC GA3GCATCIG GCAAAAG33T ATACHAOCTC 8750 

ACCCGTGACC CCAOCADXC CCITOCA033 GCTGOGIQGG AGACAQCTAG 8800 

ACACACTCCA ATCAACICTT QQCIAQGCAA TAICAICATC TATQOGQQCA 8850 

GCCTATOGGC AAGGAIGATT CIGAIGACIC ALTI'ITIUIC CATCCTICEA 8900 

GCICAAGAGC AACTIGAAAA AGCCCIGGAT TGTCAGATCT A033GGCTIG 8950 

CTACICCATT GAGOCACTIG ACCTACCTCA GAJCATIGAA CGACIDCAT3 9000 

GICITAGCQC ATTTACACrC CACAGTEACT CTDCAGGIGA GAICAATAGG 9050 

GIG3CTICAT GCCICAGGAA ACTIGGGGTA CCACGCTIGC GAACCIGGAG 9100 

ACATCGGXC AGAAGIGTCC GCGCTAAGCT ACIGTO3ZAG QGG33GAGQG 9150 

CCGCCACTIG TGGCAGATAC CICITEAACT GGGCAGTAAG GACCAAQdT 9200 

AAACTCACTC CAAICOOGGC OGCGTCOCAG CTGGACTTGT CTGGCIOGTT 9250 

OGTOQCIGGT TACAG0G3GG GAGACATATA TCACAGOCIG TCIO3IGG0C 9300 

GACXXQGCIG GITItXGTTG TGQCTACIQC TACTTICIGT AGGGGTAGQC 9350 

ATTTACCIGC TCCQCAACOG A1GAAQG3GG AGCTAACCAC TOCAQGCCTT 9400 

AAGOCATTTC CTGITTTTIT TTITITrTTT TlTmTlTr TCrmTTIT 9450 

Trrcrnccr ricrriuiTT ttticcttic imrcccrr ctttaatggt 9500 
GGCICCftTCT TAGGOCTAGT CAGQGCiaQC TCT3AAAG3T COSIGftQOOG 9550 
CftTCACIGCA GftGAGIQCIG ATACIQ3XT CICIQCAGAT CAIGT 9595 
MSINEKPCPK TKFMNRRPQ D^KFPSGGQI VG3VYLLPRR GPRLGVRAIR 50 

KASERSQPBG ERQPIPKAFR PB3RAWAQPG YPWPLYGNEE Ii3WftGWLL£P 100 

K3SRPSWGET DPRRRSRNL3 KVIIIELTOGF AEQGXIPLV GAPLGGAARA 150 

LAH3VRVLED GVNYAlGiQLP GCSFSIEIIA IiSCLTTPAS AYEVRNVSGI 200 

YWINDCSNS SIVYEAAD7I MHTPQCVPCV QB3NSSRCWV" ALTPTLAARN 250 

AS VFT1T1K R HVHIM3EAA PCSAMXVGEL 03SIFLVSQL FTFSPRRHET 300 

VQDCNCSIYP GHV9GHFMAW rWMMNWSPIT ALWSQLLPI PQAVVIM/AG 350 

AHtf3VLAGLA YYSMVGNWMC VIJVALLEAG VDCEIHTIGR VAGHTISGFT 400 

SLFSSGASQK IQLVNUSCSW HINFTMJOJ DSLQIGFEAA. LFYAHKFNSS 450 

GCPERMASCR PHDHFAQOWG P1IYIKPNSS D2RPYCWHYA PRPOGWPAS 500 

QVOSPVYCFT PSPWVGTID RSGVPIYSWG ENEHWMLLN NIRPPQGNWF 550 

GCIVMNSIGF TCTGGGPFCN IGGVGNRELI CPlUJb'KKHP EA1YIR3GSG 600 

PWLTPRCLVD YPYRIMKPC TL2SFSIEKVR MYVGGVEHEL NAAQ3WIH3E 650 

RCNLECKERS ELS PLLLS TT EWQILPCAFT TLPALSItXI HLH^UVWQ 700 

YLYGVGSAFV SFAIKWEXIL II.FLIIADAR VCACLJftMMLL IAQAEAALEN 750 

LWLNAASVA GAH3ILSFLV FFCAAWYIKG KLAPGAAYAF YGVWPLLLLL 800 

LALPPRAYAL DREMAASCGG AVLVGLVFLT LSFYYKVFLT RLIWWLQXFI 850 

TRAEAHMJVW VPPUWRGGR DAIILLTCAV HPEf.iTFDITK Li-IAILGELM 900 

VDQAGZTRVP YFVRAQGLIR ACMLVRKVAG GHWGMVFMK LGALTCIWY 950 

KHLTPLREWA HAGLRDLAVA VEPWFSAME TKVTIWGADT AACGDIHGL 1000 

PVSARRGKEI FIGPAD5LEE QQWRLLAPIT AYSQQIFGVL GCIITSUIGR. 1050 

EKNQVEGEyQ WSIMQSFL ATCINGVCWT VYH3AGSKIL AGPKGPTICM 1100 

YTNVDLDLVG WQAPPGARSM TPCSOGSSDL YLVTRHAEVI FVRPEGDSRG 1150 

SLLSPRPVSY LKGS933PLL CPSGHWGVF RAAVCTFGVA KAVDFTPVES 1200 

METTO1RSEVF UNSTPPAV? QTFQVAHLHA PTGSC3<SIKV PAAYAAQGYK 1250 

VLVLNPSVAA TLGPGAYMSK AH3IDFNIPT GVKT1T1U3S ITYSIYGKFL 1300 

ADGGCSGGAY DIIICDECHS TD5ITCLGIG TVLDQAETAG ARLWLATAT 1350 

PPGSVIVPHP NIEEIGLSNN GEIPFYGKAI PIEAIKGGRH LIFCHSKKKC 1400 

DELAAKLTGL GLNAVAYYPG UWSVIPPIG DWWA1DAL MIGFTGDFDS 1450 

VIDCNICVIQ TVDFSLDPTF TJLE1T1VPQD AVSRSQFPGR TGRGR9GIYR 1500 

FVTPGERPSG MFDSSVDGEC YDAGCAWYEL TPAETSVRLR AYLNTPGLPV 1550 

CQLHLEFWES WIGLTHIDA HFLSOTKQAG ENFPYLVAYQ AWCAPAQAP 1600 

PPSWLO^KC KERLKPTLKG PTPLLYRLGA VCNEVILTHP ITKYIMACMS 1650 

ADLEWTSIW VLVGGVLAAL AAYCLTTGSV VTVGRIIL9G KPAWPEKEV 1700 

LYQEFDEMEE CASQLPYIEQ GMQLABQFKQ KATGTIQIAT KQAEAAAPW 1750 

ESKWRALETF WAKHMa!NFTS GIQYLAGLST LPGNPAIASL MAFTASITSP 1800 

T TTnKTTT ,T PN? IIGGWVAAQL APPSAASAFV GAGIAGAAVG SIGU3KVLVD 1850 

ILAGYGAGVA. GALVAFKVKS GEVPSIEDLV NLLPAILSPG ALWGWCAA 1900 
ILRRHVGEGE GAVQWMNRLI AE&SKGNHVS PTHYVPESEft. AARVIQUSS 1950 

LTTIQLLKRL HSWINEDCST PC93SWLRD7 WDWICIVUID FK3WLQSKLL 2000 

PBLPGVPFLS QQRGYKGVWR GOGIMJl'lUP 03AQIAGHVK N3&JKEVGPR 2050 

TCSNIWB3TF PINAYITGPC TPSPAFNYSR ALJWRVAAEEY VEVlKVLUbH 2100 

YVTaCHN7 KCPOQVPAPE FFTEVDGVFL HKYAPACKPL LPED7TPQVG 2150 

LN2YLVGSQL FCEPEPD7IV LTSMLTDPSH HSffilftKRRL ARGSPPSLAS 2200 

SSASQLSAPS LKft liLTJLH HD SPDADLJEAN LIWRQEM32J IIFWESENKV 2250 

VILDSFEPIH AB3 EREISV AAEUKCSRK FPSALPIWAR PIKNPPLLES 2300 

WKDPDWPPV VH3ZPLPPTK APPIPFPKRK RIWL1E3^7 SSAIAELA3K 2350 

TFX3S9GSSAV D0CT&3WLTD LASEDXKGS DVESYSSMPP TM4KKJJtflJL 2400 

SDGSWSIVSE EASETWVOCS MSYIWIGALI TPCAAEESKL PINPLSNSLL 2450 

RHHNMWAXT SRSASIJ^KK VTETBLQJLD EtGPTWLKEM KAKASIVKAK 2500 

LLSIEEACKL TPPHSAKSKF GWSKEWRNL SSPAVNHIRS VWEEXIEDTE 2550 

TPlUl'l'iMAK SEVPCVQPEK QGRKPARLIV FPDUSVRVCE KMALYEWST 2600 

LPQAVM3SSY GPQYSPKQFV EFLVNIWKSK KCHCFSYDT PTFDSIVIES 2650 

DIRVEESIYQ Q3XAPEARQ AIRSLTERLY IG3PLTNSKG 2700 

9GVLTISCX2<J TLTCYLKATA ACRAAKLQDC TMLVN3XLV VICESAGIQE 2750 

DAAALKAFIE AMTFYSAPPG DPFQPEYDLE LTTSCSSNVS VAHCASOCRV 2800 

YYL1RDPTTP IAFAAWEI&R HTPINSWLJ3SI IIMXAP3IJWA KMELMIHFFS 2850 

HLAQEQLEK YSIEPIDLPQ IIERLH3LSA FILHSYSFGE 2900 

INRVASCLRK DGMPPLKIWR HRARSVRAKL LSQ33RAATC GKYLFIWAVR 2950 

TKLKLTPIPA ASQLDL9GWF VAGYSG3DIY HSLSRARPRW FPDZLLLLSV 3000 
GVGIYLLPNR 

FIG. 14H 
#2. Strategy for constructing a chimeric clone of HCV (pH77CV-J4) 
which contains the nonstructural region of strain H77 and the 
structural region of strain HC-J4 



[Schematic of the cloning strategy.

pCV-H77C genome map: 5'UTR, C, E1, E2, p7, NS3, NS4A, NS4B, NS5A, NS5B, 3'UTR.
Restriction sites: Age I (156), Cla I (710), Eco47 III (blunt end; 2851), Hind III (7862), Nde I (9160), Afl II (9403).

PCR products from pCV-H77C: primers H2751S (Cla I/Nde I), H2870R, H7851S and H9140S(P-M); fusion PCR.
Artificial Nde I site (2763): GCA TAC GCA changed to GCA TAT GCA.
Eliminated Nde I site (9160): gcC ATA TGt changed to gcT ATA TGt.

pCV-J4L6S: Xho I (2282), artificial Nde I site (2763); primers J4-2271S and J4-2776R(Nde I).]

1. Fragments A, B, C and D; PCR amplification from pCV-H77C or pCV-J4L6S 

• Fragment A; additional Cla I site, artificial Nde I site induced by a single mutation 
(C->T at nt 2765 of H77C) and authentic Eco47 III site 

• Fragments B and C; eliminated Nde I site by a single mutation within the primers 
(C->T at nt 9158 of H77C), and fusion PCR with both fragments 

• Fragment D; artificial Nde I site induced by 2 point mutations within the primer 
(T->A at nt 2762 and C->T at nt 2765 of J4L6S) 

2. TA cloning of PCR products 

3. Sequence analysis 

4. Cloning of Fragment A (Cla I-Eco47 III) and Fragment B/C (Hind III-Afl II) with correct 
sequence into pCV-H77C 

5. Complete sequence analysis of new cassette vector [pH77CV], into which the structural 
regions of different genotypes can be inserted. 

6. Cloning of the Age I-Xho I fragment (cut out from pCV-J4L6S) and Fragment D (Xho I-Nde I) 
with correct sequence into the new cassette vector; 3-piece ligation 

7. Complete sequence analysis of 1a+1b chimera [pH77CV-J4] 

8. In vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee 



FIG. 15 
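
The primer mutations listed in step 1 of FIG. 15 can be checked with a short script. The following Python sketch is only an illustration: the two 13-nt windows, the window start coordinate and the helper names `apply_point_mutation` and `find_site` are assumptions made for this example and are not taken from the clones themselves. It assumes only that Nde I recognizes CATATG, and it shows how a C->T change at nt 2765 (H77C) or a T->A change at nt 2762 plus a C->T change at nt 2765 (J4L6S) could produce that site.

```python
"""Check that point mutations create an Nde I recognition site (CATATG)."""

NDE_I_SITE = "CATATG"  # Nde I recognition sequence


def apply_point_mutation(window, window_start, position, expected, replacement):
    """Return `window` with one base changed at a 1-based genome position.

    `window_start` is the genome coordinate of window[0]. The call fails if
    the base found at `position` is not the `expected` one.
    """
    index = position - window_start
    if not 0 <= index < len(window):
        raise ValueError(f"position {position} lies outside the window")
    if window[index] != expected:
        raise ValueError(
            f"expected {expected} at nt {position}, found {window[index]}"
        )
    return window[:index] + replacement + window[index + 1:]


def find_site(window, window_start, site=NDE_I_SITE):
    """Return the genome coordinates at which `site` starts in `window`."""
    starts = []
    index = window.find(site)
    while index != -1:
        starts.append(window_start + index)
        index = window.find(site, index + 1)
    return starts


# Hypothetical 13-nt windows starting at nt 2758 (illustration only).
WINDOW_START = 2758
h77c_window = "GAGCATACGCACT"
j4_window = "GAGCTTACGCCTT"

# Fragment A: single mutation, C->T at nt 2765.
h77c_mutant = apply_point_mutation(h77c_window, WINDOW_START, 2765, "C", "T")

# Fragment D: two mutations, T->A at nt 2762 and C->T at nt 2765.
j4_mutant = apply_point_mutation(j4_window, WINDOW_START, 2762, "T", "A")
j4_mutant = apply_point_mutation(j4_mutant, WINDOW_START, 2765, "C", "T")

print(find_site(h77c_window, WINDOW_START), find_site(h77c_mutant, WINDOW_START))
print(find_site(j4_window, WINDOW_START), find_site(j4_mutant, WINDOW_START))
```

The `expected` argument guards against applying a mutation at the wrong coordinate, which is the same kind of error that the sequence analysis in steps 3, 5 and 7 is meant to catch.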
GCCAQXCCC TGKIG333QC GACACroCAC CA1GAA1CAC T QO0CIGIG A 50 

Q3AACTACIG TCTICA03CA GAAAG03TCT AGOCATO303 TEAGIAIGAG 100 

TGTOGIGCAG CCKXZAGGAC LUJCC CTCCC GGGAGAGQZA TA G I GJll ' lU ISO 

033AACCX33T GAGTACAODG GAATIGaCAG GACGAO033G 'lLLTl'lLTlU 200 

GAICAA0333 CICAATOOCT GGAGATTTGG GCGIQOOOCC QCGAGACTGC 250 

TAGOOGAGEA GTGTT333TC G03AAAGGOC TIGIGGTACT GGCTGKEAGG 300 

GIULT IGO G A. (J1UJ 0 CO3 35 AUU1CIU3IA GAUCUIGCAC CAIGAGCACG 350 

AA10CTAAAC CTCAAAGAAA AAJXAAAGGT AACAQCAAOC GOT3QOCACA 400 

GGA03TCAAG 'l'jm33333G GTGGTCAGAT UJriUUlULA GTITROCIGr 450 

TCQ030GCAG GGGOOGCAGG TlUJJiUlUC GOGOGACTAG GAAG3CTIOC 500 

GAQLUJIOGC: AACL'IUG'IG G AAGGOGACAA OCTATCQCAA AG3TICX33D3 550 

ACC03ftG33C AGGGOCIGGS CECAGOCD3G GTALLLT1UG COQCICTATC 600 

GCAA1GAQ33 CCT3333IG3 GCAGGAIQ3C TCCIGTCACC GOGOGGCiac 650 

CQ33ZTAGIT G333CODCAC GGAQG0QG33 QGTAGGTD3C GTAPCTIQ33 700 

TAAGGTCATC GAIACQCTIA CATO033TIT OXD3A3CIC A1GGSGTACA 750 

'l'lULUC'ICGT GGGQQGGGCC CEAG333333 CIUXA033C CTTGGCACAC 800 

GGIUTGQ3GG T1L1G GAQGA CGOCGTGAAC TAIGCAACAG G3AACTTGCC 850 

COJl'lU-'lLT TICICTA3CT TOCICnQX 'IL'iUL'lUlUC 'iUlTlGAOCA 900 

TCCCAGCTIC QQCTEATGAA GIGCQCAAOG TGTICQ3GGAT ATACCAIGTC 950 

AQGAAOGACT GCIGCAACTC AAQCATIGTG TAIGAGGCAG CQ3ACGIGAT 1000 

CATOCATACT GQ0Q3GTB3CG T303CIGTGT TCAGGAG33T AACAGCTOX 1050 

Gr i ai'iaxrr agqgcicact cocacgcice cggocaggaa tgocagogtc noo 

CCCACTAGGA CAATAOGACG QCADTTOGAC TiULlLUl'lU GGA03QCIQC 1150 

Tl ' lL ' lUL ' lL C GCTAIGTAGG TGGQ3GA1CT CIG033AICT ATTnOCIOS 1200 

TCTCCCAQCT GTIGAGCTIC TO3CCTO30C Q3CA1GAGAC AGTGCAQGAC 1250 

TOCAACTOCT CAAICEAICC CG30CAIGTA TCAGGTCACC GCA1G3CTIG 1300 

. G3AIAIGAIG AIGAALTGGT CACCEACAAC AG3DTEAGIG GTGT03CAGT D50 

TGCID033AT OOCACAAOT GTGGIGGACA 1UJ1GXGGG GXCCACTGG 1400 

GGAGTCTIGG UJJJ -L T1U. CTACEATia: AIGGTAG33A ACTGGGCTAA 1450 

GGTTCTGATT GTG303CTAC 'ILT l 'lULC Go CGTIGAGG33 GAGAOCTACA 1500 

CGA03333AG GGTGG333GC CACACCACCT CCG3GTTCAC GTOXTTTTC 1550 

TCATCTQ333 OGTECTCAGAA AATOCAGCTT GTGAATAGCA AOQ3CAGCIG 1600 

GCACAICAAC AQGACT3QQC TAAAITQCAA TGAL'lUX'lC CAAAC'IGGGT 1650 

TCTTTGOQX GCIGTTTEAC GCACACAAGT TCAACTQGTC 0333TG0033 1700 

GAGOGCAIGG CCAGCTGOG3 CQQCATTGAC TGGTTOQOOC AGGGGT33GG 1750 

CCCCAICACC TATACTAAGC CTAACAGCTC GGAICAGAGG CCTEATIGCT 1800 

FIG. 16A 
GGCATIACGC QOCiaSAOOG IGIQ3IGI03 TAOOOG03IC GGAGGTGIGT 1850 

GGTQCAGTGT ATIGTTICAC GGCAAGOOCT GTIGIGGTGG C3Mim 1900 

'imriuum r gillltacgt ataojilllg ggagaatcag acagaggiga 1950 

TGC'ICCICAA CAACACGOGT GGGGGACAAG GCAftCIQGTT OGGCIGTACA 2000 

TGGATGAAIA GTRCIGGG TT CACEAAGAGG TOOGGAGGTC COGGG1U1KA 2050 

CA.1D333333 GlUJLxUAACG QCAOCTIGAT CIGOOOCACG GRL'iULTlUL' 2100 

G3AAGCA03C GGAGGCTACT TACACAAAAT GflGGCTOGGG GGGCIGGTIG 2150 

ACAOTEAG3T GGCEAGTAGA CIPDXSmC AGGCTTIGGC ACTAQQCCTG 2200 

CA L ' IUICA AT TlTlLLmCT TTkPGJTTPG GA1GTAIGTG GGGGGOGTGG 2250 

AGCACAGGCT CAA1GG0GGA TOCAATIGGA. CTOGAGGAGA G0GC1UIAAC 2300 

TIGGAGGACA GGGATAGGTC AGAACICAQC GOGCIGCIGC TGTCEACAAC 2350 

AGAGIGGGAG ATACIGCQCT GIGGTTICAC CAGOCTftCOG GCITTA2CCA 2400 

CIGGTTIGAT CCAICTOCAT CAGAACA3GG IGGA03IGCA. ATPOJ1UEAC 2450 

GGIGTAGGGT CAGGGTnGT CICCTTIGCA. AICAAAIGG3 ACTTCATCCT 2500 

GTIGCnTIC CTTCIGCIGG CAGADGOGOG OGIGTGTGOC T3CTIGIGGA 2550 

TGAIGCTGCT GATAGOOGAG GCTGAGGGOG CCTTAGAGAA. CFIGGTQGTC 2600 

CICAAIGOGG OGIGGGIGGC CGGAGOGCAT GGTATICTUT CCTITCTIGT 2650 

GTICTICIGG QJQ3CCTGGT ACATEAAGGG CAGGCIGGCT GCIGGGGGGG 2700 

GGEA3GCTIT TEATGGGGTA TGGOQQCTGC TCCTGCTOCT ACIGGOGTTA 2750 

CCADCAGGAG CATAIGCACT GGACAOGGAG GTGGOOGCGT OGTGTGGGGG 2800 

GGTTGTIdT GTOGGGTTAA TGGGGCIGAC TCTGIGGOGA TATEACAAGC 2850 

GGTAIAICAG CTGGIGGATG TGGTGGCTIC AGTATITICT GADGAGAGTA 2900 

GAAQCQCAAC TGCAGGIGTG GGT1LUJ LUJ CTGAAGGTOC GGGGGGGGDG 2950 

GGAIGGQGIC ATCTEACICA. TGIGTGTAGT ACACGOGAGC CIGGTATITG 3000 

ACAICADCAA ACEACTCCIG GCCAIUITQG GAOJULTi'lG GATlLTiCAA 3050 

QCCAGTTIGC TEAAAGTGCC CUOTQGIG GGGGTICAAG GOCTICICOG 3100 

GATCIGGGOG CTAGGGGGGA AGAIAGOCQG AGGICATIAC GTGCAAAIGG 3150 

CCATCAICAA GTTAGGGGGG CTTACIGGCA OCTAIGTGIA TAACCATCIC 3200 

ACGGCTCTIC GAGACIGGGC GCACAAGGGG CIGGGAGAIC '1GG OCGIGGC 3250 

TGIGGAAOCA GICGTCTICT CQGGAAIGGA GACCAAGGTC AIGAOGTGGG 3300 

GGGGAGATAC GGOGGCGTGC GGTGACATCA TGAAGGGCTT O-UJJIC'ICT 3350 

QCCGGTAGGG GOCAGGAGAT ACTGCTTGGG GCAGOCGADG GAATGGICIC 3400 

CAAGGGGTGG AGGTIGCIGG CGOOGAICAC GGGGTAGGCC CAQCAGAGGA 3450 

GAGGCCIGCT AGGGIGTATA ATCAGCAGGC TGACIG3CCG GGACAAAAAC 3500 

CAAGTGGAGG GIGAGGTGCA GATOGIGTCA ACIGGTAOX AAAOCTIGCT 3550 

QGCAAGGIGC ATGAAIGGGG TATGCIGGAC 'iGICTACCAC GG3300GGAA 3600 

FIG. 16B 
QGAG3AOCAT OTCATCAOOC AA033I0CIG TCAIOCfGAT GEATAQCAAT 3650 

GIQ3A0CAAG AG CTIGIQ3 S CIG3000QCT CCICAAQGTET CXXt31TCATT 3700 

GACACQCItJT AGCIQ0GQCT CCT033AOZT TEA0CI03IC AQ3AO3GA0G 3750 

CCGATUICAT TCCGGIQ3X CQGOGAGGIG ATAQCAGG33 TAG0CIGCIT 3800 

iamx333C mrnacm ctigaaagx laciasaoos Giaarrorr 3850 

GIQOCOCOaG QGACAOQOOS TQ333TIMT CAQ33QCX30G GIGIGCADCT 3900 

GIQ3AGIG3C TAAAQ333IG GACITIMCC CIGK33AGAA. CCTAQ33ACA 3950 

ACCAIGAGAT CCC033IGIT CAQ3QACAAC TOC7ICIC3CAC CAGCAGIOOC 4000 

CCAGASTTIC CA03IG33X AOCTQCASQC T0CCAC033: AQ033EAAGA 4050 

QCACCAAG^T CCCQOT333 QGEGTiaSIG 4100 

CICAACGQCT CT5IT G CIQC AA03CTIQ33C TTTQ3IQCIT ACAOGIOCAA. 4150 

Q3CCCA1G33 GITGAIOCm A3XCCAQ3AC G3333IGAGA. ACAATIBaCA 4200 

CIQGCAOCXX: CATCAGGIAC TOZACCEACG GCAAGITCCT TQO0GAO33C 4250 

GQGIGCICAG GAQSIQCTm TGACATAA1A. ATI'IUIGAOG AGIQGCACIC 4300 

CAOQGAIGX ACATCCATCT TG33CATO33 CACIGICCIT GACJCAAQCAG 4350 

AGACIQ0Q33 Q3Q3AGACIG GIT CT GCIC G CCACTCIAC OXTO33QX 4400 

TCQGICACIG TGIOCCAIOC TAACATCGAG GAGSITQCIC TCTOCAOCAC 4450 

CQSAGAGATC CCCTITEAQG GCAAG3CTEAT QOGQCID3W3 GTTGATCAAG3 4500 

Q33GAAGACA TCICAJCITC TGCCfOCAA i^AAGAAGIG QGAOGAQCIC 4550 

OODQ33AAGC TGGTE03CAIT G332A2CAAT ACEACQ3CGG 4600 

ICITCACGIG TCIGICATCC CGAQCAQ033 OGAIGITGIC GTOG TOIOSA 4650 

CCGAIQCICT CA.TGACIGX TTM33333 ACITOGACTC TGIGAIAGAC 4700 

TCCAACAGGT GIGICACICA GACAGIQ3AT TICAOOCTIG AOCTACCIT 4750 

TADCASTGAG ACAAGCAOSC T0003IAGGA TG3D3ICTCC A QGACICAAC 4800 

GCOG33XAG GACIGQCAQG G93AAGOCAG GCA1CTAIAG ATTIGTGGCA 4850 

CX333333AGC GCCCCID333 CAIGTTCGAC 1UJ1UUCTOZ TCIGIGAGIG 4900 

CTMGAOGOG GGCIGIGCTT GGTA3GAG7T CAO3X03X GAGACIACAG 4950 

TTAG3CEACG AGGGIACA1G AACAQ0G033 (JULTlUOOSr GIQOCAGGAC 5000 

OOCTIGAAT TTIG33AG33 0SECTITAQ5 GGOCICACIC ATATAGATOC 5050 

ocA crrrrm tcqcagacaa agcagagigs GjAGaactit q?itmc ig3 5100 

TAGOGIfcDCA AGOCAOOGIG T3D3ZEA033 CICAAQ333C auj-CCA TOS 5150 

TG3GACCAGA TGTG3AAGIG TTIGAIOOX CTEAAAQOCA CCCTXATOG 5200 

QCTAACAOX CTGCIAIACA GACT0333X TGTICAGAAT GAAGICAOOC 5250 

'IGACQCADCC AA.TCACCAAA TACAICAIGA CAJGCAIGTC GX03AOCIG 5300 

GAG3I03TCA O3AQCA0CIG G3IGCI03TT G333333TGC T33CTGCICT 5350 

GGOOGQGTAT TG OJ1U1C AA CAG3J1UXT GGTCATAGIG GGCAGGAIOG 5400 

FIG. 16C 
T CTIGIQQGG GAAGGQG9CA ATTATACCIG ACAGGGAGGT TCICIAQCAG 5450 

GAGTTOGA1G AGA1QGAAGA. GTGCICICAG CACTEAOGGT ACATOGAGCA 5500 

AGGGATCATG L ' l UXTGAQC AGTICAAGCA GAAGSOCCTC Q30CIGCIQG 5550 

AGAQOGOGTC QGGOCAIGCA GAGGTIATCA OGOL'IGCIGT GCAGA03\AC 5600 

T3GCAGAAAC TDGAGGTCIT TIGGG3GAAG CACAIGTOGA ATTCCATCAG 5650 

1QQ3RTRCAA. TACnGGOGG G UL'IGTCA AC QCIGC3CIQ3T AACJOGGXA 5700 

TIU LTJAJA TT GAIGGCTITT ACAULIOGCG TCAOGAGOGC ACTAAOCACT 5750 

GGCCAAAD3C ' lUL ' lL'l ' lU AA CKEATIOGQG GGGTGQGTQG CIQ30CAGCT 5800 

caacxxxxxx: uji u ug gcea. ciujumur gggtgciggc ctbgciggog 5850 

CCGOCA1GGG CAQ mi ' lUG A CIGGSGAAGG 'lUL'lULflUUA CAT1LT1ULA 5900 

GGGUaQOOG O3930GTQGC Q3GAGCICTT GTIAGCATICA AGATCATGAG 5950 

CGGTGAG3TC 00C3OCAG3G AQGAOJIUGT CAAICIUCTG C00G0CA3CC 6000 

'lL'lUUUC'lGG AQCXJCriGTA. (J1LUJ1G1G G 1CIGGQCAGC AATACIGOGC 6050 

OGGCAOGTCG G0CCG3GOGA GGGGGCAGTG CAAJQGAIGA AG3GGCIAAT 6100 

AG0 LT1UU GC T3G0QGGGGA ACCAIGTTIC OGQCAGGCAC TAOGIGOGGG 6150 

AGAQQGAIOC AGCOGOQQGC GICACIQ02A TACTCAQCAG CCTCACTGTA 6200 

ADXAGCTOC TGAGGCGACT GGATCAGTGG AIAAQCTOGG AGTGTACCAC 6250 

TCCAIUL'IU C GGTICCTGGC TAAGGGACAT C1GGGACTGG AIJOGOGAGG 6300 

TOCTGAGOGA CTITAAGAOC TGGCTGAAAG OCAAGCTCAT GOCACAACTG 6350 

CCTGGGATIC CCTTIGTGTC CTG3ZAQCGC GGGTATAGGG GGGTCTGGOG 6400 

AGGAGAGQOC ATIA1GCACA CILU-'IGGCA CTGTGGAGCT GAGATCACTG 6450 

GACAIGTCAA AAAGGGGAOG A3GAGGATOG TCQGflCTTAG GACCIGCAGG 6500 

AACA3GTQGA GIGGGAQGTT COOCATIAAC GGCTACAOCA OGG GGGGGIG 6550 

TACTCGCCTT QGTGGGCQGA ACTATAAGTT GGOGCTGTGG AQGGTGICIG 6600 

CAGAGGAATA OGTGGAGATA AGG033GIGG GOGACTTOCA CTAOiiAICG 6650 
GGTATGACTA CTGACAATGT TAAAIGCCOG TQGCAGATGC CAIOGCQOGA 6700 
ATITTICACA GAATIGGAGG GGGTGOGOGT ACACAGGTIT QGQGOCGCTT 6750 
GCAAQUUL'iT GCT3G3GGAG GAGGTAICAT TCAGAGTAGG ACICCAOGAG 6800 
TACCOGGIGG GGTOGCAATT AOGTIGGGAG C03GAAD3GG ADGTAGGOGT 6850 
GTIGAOGTOC ATGCTCACTG ATCOCICOGA TATAACAQCA GAGGu^XXXi 6900 
GGAGAAG3TT GGGGAGAGGG TGAQ Q OOCIT CTAIGGQCAG CTOCI03GCT 6950 
AGOCAGCTGT COGCIOCATC TCICAAGGCA ACTIGCACQG CCAAGCAIGA 7000 
CTOCGCIGAC GGOGAGCICA TAGAGGCTAA (JL'lUL'iUlLG AG3CAGGAGA 7050 
TGQGGGGCAA CATCAiXAGG GTTGAGTCAG AGAACAAAGT QGTGATICIG 7100 
GACIGCTTOG ATO3GCTIGT GGCAGAGGAG GATGAGQGQG AGGTCTOGGT 7150 
ACCTGCAGAA ATICTGOGGA AGICTQGGAG ATID3GQQGG GGCCIGOOQG 7200 

FIG. 16D 
TCIQ330QC3G GQ033ACIAC AAC0G0033CZ TAGIBGAGAC GIQSAAAAfiG 7250 

CCTGACEA03 AACCADCIGT GGI0CAIQ3C TQCDOQCEAC CACCIQZAD3 7300 

GICOCCTOCT GIQ0CIO03C CI033AAAAA GCXjfIM333IG GICCTCAOCG 7350 

AAICAADOCT AICEACIG3: T lUaOO G RGC TTOOCAOCAA Afl Ul'JLTlUU L' 7400 

AQCJIU-'ICAA CTI0333ZAT TAD33333AC AATAOoACAA CAIOJIC'IUA 7450 

G000300QCT IL'IUJL ' IUX OGOOOGaCIC CXaftOGTIGRG TCCEOTCTT 7500 

OCAIG0ODOC UJlUaft QSGG OQCXTO3Q3 AITXQGAICT CW30GAO393 7550 

TCALLUjIOGA COJlC a GTOG T33333CGAC A033AAGAIG TOID3IGCIG 7600 

CICAA1GICT TATTOCIQSA CAG30XACT OGICADOOCG TGO3CIQ0QG 7650 

AAGAACAAAA ACIO00CA3C AA03CACIGA GCAftCTOGIT QCIA03GCAT 7700 

CACAATCIQG 'lUimTCCAC CACITCAOX AGIQCTiaX AAAGQCAGAA 7750 

GAAAGICACA TIT3ACAGAC TQCAAGITCr GGACAQQCAT TAQCAQGACG 7800 

TGCICAAQ3A OJICAAAQCA G03XCTOA AAGIGAAQ3C TAALT1UL'1!A 7850 

TCCCTAGAQ3 AAQCTIOCAG CCIGAOQOOC CCACATICAG CCAAATOCAA 7900 

GinasrmT G333CAAAAG AQGnEQGTIG OCATO32AGA AAGSXEaSG 7950 

GCCACATCAA CTO3IGIG5 AAAGAQCTIC TQGAAGACAG TGTAACAOCA 8000 

ATAGACACEA CCA1CA3QX CAAGAA03AG G'l'i'i'lL'lGQG TICAQCCIGA 8050 

GAA03Q333T CHIAAGGCAG CTOSICI CA T CLJIUI'ICJOCE GAOCTO3303 8100 
TOQQCGTGTG G2AGAAGA2G GODCJIGEADG AOGTCQSTIAG CAAGCTOCXT 8150 
CIQQGG3IGA TO33AAGCTC CTAC3Q3ATIC CAAIACTCAC CAG3ACAQ0G 8200 
GGTIGAATTC CICGIG C AAG Q3IQ3AAGIC CAAGAAGACC COGA1G3QGT 8250 
TCTOGTAIGA TAQQQ3CTGT TTIGACICCA CAGTCACIGA GAGOGACATC 8300 
OJCAQGGAGG AQ3CAAITTA CCAAJCTICT GACCT33A0C O0 CAAGO0O3 8350 
CGIQXCAIC AAGICCCICA CIGAGAGOZT TTAIGTiaQG G3XCICTEA 8400 
CCAATICAAG GGQ33AAAAC TG0Q3CIADC QC2G3IGG0G O30GAQ0QQC 8450 
GEACIGACAA. CIAGCIGIQ3 TAACACOC7IC ACTTGCTACA TCA&u^&S 8500 
QQCAGGCIGT CGAQXGCAG G3CIQCAGGA CIOZAOCATC CI03IGIGIG 8550 
GCGACGACIT AGTOGTTATC TGIGAAAGTG 03LUJJ1CCA OGAQGACQGG 8600 
GCGAGCCIGA GAQOCTICAC GGAGQCIATG ACCAQ3EACT OOGCTOGOGC 8650 
CQ03GACQCC CCACAAOCAG AATAOGACTT G3AGCITAIA A CAICA 1QCT 8700 
CCICCAACCT GICAGTOQOC CAOGAD330G CT3GAAAGAG GCTCTACTAC 8750 
CITACQ03IG ACCCEACAAC OOOUL'lUCPG AGAQ003QGT G3GAGACAGC 8800 
AAGACACACT CCAGTCAATT CCIGGCEAG3 CAACAIAAJC ATCmOTOC 8850 
CCACACTGTG G3GGAGGAIG ATACTGATCA OGCAITiLTl' TAG03IDCIC 8900 
ATAQCCAG33 AICAGCTIGA ACAG3CJTCIT AACIGTGAGA TCTA03GAQC 8950 

cigctacicc atagaaocac tqgatceaoc tccaaicajt CAAAGACTOC 9000 

FIG. 16E 
A2QGQCICAG GGCATITTCA. CICCACAGIT ACICIOCAQ3 1GAAA3CAAT 9050 

AQ33IQ3QCX3 CA3Q3TICAG AAAACITG33 GiaOOSOOCT TQOGAGCTIG 9100 

GAGACACCQG Q33333AG2G T0O3C3XTAG CTTICIGICC AGAQGAGQCA 9150 

GGGCIQCIM' ATCIG3CAAG TACCICTICA ACIG3GCAGT AAGAACAAAG 9200 

CECAAACTCA. CIOCAATAOZ GG0C D CT3GC 033CJIQGACT TGI0033ITG 9250 

GnCA033CT G32TACAQ0G QQQ3AGACAT TIAICAlZAQC GIGIUICA3G 9300 

CC0333CC0G CIUJI' I UIGG Tl'i'iLCCTAC 'lLUlUC'lCDC 1GCAG333IA 9350 

G3CATCTACC ' 1UCTC00CAA. OQGAIGAAG3 TIQ333EAAA. CACID333GC 9400 

ICTEAAQOCA TnOCTSTTT Tl'l ' l'l ' l ' lTlT TITlTriTIT Tl'lTlUiTlT 9450 

TnTmcrr locrnocrr crnTrrroc Tncrrmc ocncrnAA. 9500 

TGb'lGXHX A TCTTO GOQC TAGICA03X TAOZIGIGAA. AQ3ID2GIGA. 9550 

GCCX3CAIGAC T3CAGAGAGT GCIGAIACIG GCCICICIDZ AGA1CA3GT 9599 

FIG. 16F 
10 20 30 40 50 

1234567890 1234567890 1234567890 1234567890 1234567890 

MSTNPKPQRK TKRNINRRFQ DVKFP3332I VGGVYLLPRR GPRD3VRAIR 50 

KASER9QPRG KRQPIPKftRR PH3&WAQPG YPWPLYGNEG D3WAQWLLSP 100 

EGSRPSWGPT DPKRRSRNLG KVHHLTOGF AEOEYIPLV GAPLGGAAPA 150 

LAH3VRVLED GVNYAIGNLP QCSFSIFLLA LLSCLTTPAS AYEVENVSGI 200 

YHTINDCSNS SIVYEAADTI MHTPQCVFCV QB3N5SRCW ALTPTLAARN 250 

ASVPimRR HVDLLVGTAA. PCSAMYVGDL OGSIFLVSQL FEFSPRRHET 300 

vqcocsiyp ghvsghrmaw im^wsptt ALWSQLLRI PQAWEMMAG 350 

AHWGVLAGLA YYSfcfiflaNWAK VLIVALLEAG VDGL'lHTiGR VAGHTTSGFT 400 

SLFS9GASQK IQLVNINGSW HENRIRLIOI DSLQIGFEAA. LFYAHKENSS 450 

QL' PK KMASCR PIEWERQQWG PTIYIKENSS I£RPYCWHYA PRPQ3WPAS 500 

CVOGPVYCFT PSPWVGTID RSGVPTYSWG ENEJIU^MLLN NIRPFQC3WF 550 

GCTWNSIGF TKIC03PFCN IGGVGNRILI CPIDZEBKHP EATCTKQGSG 600 

PWLTERCLVD YPYRUWHYPC TLIFSIFKVR MWGGVEHRL NAAQWERGE 650 

RCNLEDRERS ELSPLLLSTT EWQILPCAFT TLPALSIGLI 700 

YLYGVGSAPV SFAIKWEYIL LLFULADAR VGACLIWMLL IAQAEAALEN 750 

LWLNAASVA GAH3ILSFLV FPCAAWYIKG RLAPGAAYAF YGVWPLLLLL 800 

LADPPRAYAL DTEVAASQGG WLVGLMALT LSFYYKRYIS VQM/CQYFL 850 

TRVEAQLHVW VPPLNVRGGR DKVUIiCW HPTLVFDITK T.T.T AIFGPLJW 900 

ILQASLLKVP YFVRVQGLLR ICAIARKIAG GHYV134AIIK IGALTGTYVY 950 

NHLTPLRDWA. HNGLRDLAVA VEPWFSRME TKLITWGADr AAQGDIDJX 1000 

PVSARRQQEI LDSPADGMvS KQWRLLAPIT AYAQ2IRGLL GCHTSLTGR 1050 

DKN2VEGEVQ IVSmTQIFL ATCHSGVOrTT VYHGAGTRTI ASPKGPVI^l 1100 

YINVDQDLVG WPAPQGSRSL TPCTOGSSDL YLVIHHAEK/I FVRRRGDSRG 1150 

SLLSPRPISY LKGSSGGPLL CPAGHAVGLF FAAJvCTRGVA KAVDFTPVEN 1200 

LCTIMRSPVF UNSSPPAVP QSPQVAHLHA PTGSGKSTKV PAAYAAQGYK 1250 

VLVLNPSVAA TLGFGAYMSK AHGVDPNIRT GVKiTi'lGSP nYSTOGKFL 1300 

ADQ3CSQGAY DIIICDEEHS 1DATSILGIG WLDQAEEAG AELWLAIAT 1350 

PPGSVTVSHP NIEEVALSTT GKLPFYGKAI PLEVIKGGRH LTFCHSKKKC 1400 

DELAAKLVAL GINAVAYYFG LD7SVIPT9G DVWVSIDAL MTGFIGDFDS 1450 

VIIXNICVIQ TVDFSLDPTF TIEITTLPQD AVSR3QRRGR TGRGKPGIYR 1500 

FVAPGEFP93 MFDSSVLCEC YDAGCAWYEL TPAEITVRLR AYMtfTPGLPV 1550 

CQIHLEFWEG WIGLTHHA HFLSQIKQSG EttFPYLVAYQ AIVCAFAGAP 1600 

PPSWD3MWKC IZRLKPTLH3 PTPLLYRLGA VQSEVILTHP HKYIMiaflS 1650 

ADLEWISIW VLVQGVLAAL AAYCLSIQCV VTvGRIVLSG KPAIIFEREV 1700 

LYQEFDEMEE CSQHLPYIEQ GMMLAEQFKQ KALG3X£JEAS RHAEVTTPAV 1750 

QINWQKLEVF WAKHMWNFIS GIQYLAGLST LPGNPAIASL MAFTAAVTSP 1800 

LTIGQTLLFN IL33WVAAQL AAPGAATAFV GA3LAGAAIG SVGLGKVLVD 1850 

ILAGYGAGVA GALVAFKE-S GEVPSIEDLV NLLPAILSPG ALWGWCAA. 1900 

FIG. 16G 
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H77CV-J4aa Sequence 



10 20 30 40 50 

1234567890 1234567890 1234567890 1234567890 1234567890 

TTBPHVnPGE GAVQ^NRKE AEASRQ4HVS PTHYVPESE& AARVEMLSS 1950 

LTVIQLIPRL H3WISSEUIT PC9GSWLRDI WDWICEVLSD FKIWLKMCLM 2000 

PQLPGIPFVS OTGHGVWR GDGIMHIFCH OGAETIO^VK M3IMRIVGPR 2050 

TCRNMrtSSIF PINAYTIGPC TPLPAPNYKF MJWRVSAEEY VEERKVGDEH 2100 

wscainiMj kcrdqipsee ffteldgvrl hrfappckpl lreevsfrug 2150 

LHEYFVG93L PCEPEPIHAV LTSMLTDPSH ITAEAAGKRL ARGSPP31&S 2200 

SSASQLSAPS I^ATCEKNHD SEDRELTEAN LUWRQEM3C3ST TH5VESENKV 2250 

VILDSFDPLV AEEDEREVSV PAEHLRKSRR EARALPVWAR PDYNPPLVET 2300 

VKKPD¥EPPV VH3CPLPPPR SPPVPPPRKK RTWLTESIL STALAELAIK 2350 

SPGSSSISGI 1GENTTES SE PAPSGCPPDS EWESYSSMPP LEEEPGDPDL 2400 

SDGSWSIV5S GADTEXWVOC SMSYSWIGAL VTPCAAEHQK LPHMALSNSL 2450 

LRHHNLVYST TSRSAOQRQK KVTFIBLQVL DSHYQEWLKE VKAAASKVKA 2500 

NLLSVEEACS LTPPHSAKSK HARKAVAHIN SVWKDLLEDS 2550 

vrpurmm knevpcvqpe kqc3?kparli vfpddsvrvc ekmalydws 2600 

KLPLAVM3SS YGPQYSPGQR VEFLVQSWKS KKTPMSFSYD TRCFDSIVrE 2650 

SDIKIEEAIY QCCDLDPQAR VAIKSLTERL YVQGPLTNSR GEN03YRRCR 2700 

ASGVLTTSCG NTLTCYIKAR AACRAAGLQD CIMLVOGDDL WICESAGVQ 2750 

EDAASLRAFT EAMIRYSAPP OTPQPEYDL ELITSCSSNV SVAHDGAGKR 2800 

VYYLTRDPTT PLARAMtfEIA PHTFVNSWLG MIMFAPT1JW ARMELMIHFF 2850 

SVLIAPDQLE QALNCEIYGA CYSIEPLDLP PIIQRLH3LS AFSLHSYSPG 2900 

EINRVAACLR KL3VPPLRAW RHRARSVRAR LLSROGRMI CX2CiT£MtfAV 2950 

KIKLKLTPIA AAGRLDLSGW FTAGYSGGDI YHSVSHARPR WFWP TT LLLA 3000 

AGVGIYLLPN R 3011 

FIG. 16H 
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#1a. 3' Deletion mutants of pCV-H77C 

Sequence of 3' untranslated region of pCV-H77C 

5' UTR - ORF - 3' UTR 

3' UTR: 3' variable region (43 nts), poly U-UC region (81 nts), 3' conserved region (101 nts) 

(3' variable region; 43 nts) 

tga (Stop codon for polyprotein) 

AGGTTGGGGT AAACACTCCG GCCT CTTAAG CCATTTCCTG 
(Afl II site: CTTAAG) 

(poly U-UC region; 81 nts) 

TTTTTTTTTT TTTTTTTTTT TTTTTTTTCT TTTTTTTTTT CTTTCCTTTC 
CTTCTTTTTT TCCTTTCTTT TTCCCTTCTT T 

(3' conserved region; 101 nts) 

AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT 
GAGCCGCATG ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCAGATCATG 
T 
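
The region lengths stated in the figure can be checked directly against the printed sequence. The following is a minimal sketch, not part of the original disclosure; the helper name `clean` and the variable names are illustrative assumptions, and the sequences are copied from the figure with spacing removed.

```python
def clean(text: str) -> str:
    """Remove the grouping spaces used in the figure."""
    return "".join(text.split())

# 3' variable region, including the TGA stop codon of the polyprotein.
VARIABLE = clean("TGA AGGTTGGGGT AAACACTCCG GCCT CTTAAG CCATTTCCTG")

# Poly U-UC region.
POLY_U_UC = clean(
    "TTTTTTTTTT TTTTTTTTTT TTTTTTTTCT TTTTTTTTTT CTTTCCTTTC "
    "CTTCTTTTTT TCCTTTCTTT TTCCCTTCTT T"
)

# 3' conserved region.
CONSERVED = clean(
    "AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT "
    "GAGCCGCATG ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCAGATCATG T"
)

REGION_LENGTHS = {
    "3' variable": len(VARIABLE),
    "poly U-UC": len(POLY_U_UC),
    "3' conserved": len(CONSERVED),
}

for name, length in REGION_LENGTHS.items():
    print(f"{name}: {length} nts")
```

With the sequences as printed, the three lengths agree with the 43, 81 and 101 nts given in the figure labels.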



#1a -1. pCV-H77C(-98X) ; 3' 98 nucleotides removed from pCV-H77C 

TGA AGGTTGG GGTAAACACT CCGGCCT CTT AAG CCATTTC CTGTTTTTTT 
TTTTTTTTTT TTTTTTTTTT TCTTTTTTTT TTTCTTTCCT TTCCTTCTTT 
TTTTCCTTTC TTTTTCCCTT CTTTAAT 

#1a -2. pCV-H77C(-42X) ; 3' 42 nucleotides removed from pCV-H77C 

TGA AGGTTGG GGTAAACACT CCGGCCT CTT AAG CCATTTC CTGTTTTTTT 
TTTTTTTTTT TTTTTTTTTT TCTTTTTTTT TTTCTTTCCT TTCCTTCTTT 
TTTTCCTTTC TTTTTCCCTT CTTTAATGGT GGCTCCATCT TAGCCCTAGT 
CACGGCTAGC TGTGAAAGGT CCGTGAGCCG CAT 

#1a -3. pCV-H77C(X-52) ; All of the 3' UTR sequence, except 3' 49 nucleotides, 
removed from pCV-H77C 

TGAGCCGCAT GACTGCAGAG AGTGCTGATA CTGGCCTCTC TGCAGATCAT 
GT 

FIG. 17A 
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#1a -4. pCV-H77C(X) ; All of the 3' UTR sequence, except 3' 101 nucleotides, 
removed from pCV-H77C 

TGAAATGGTG GCTCCATCTT AGCCCTAGTC ACGGCTAGCT GTGAAAGGTC 
CGTGAGCCGC ATGACTGCAG AGAGTGCTGA TACTGGCCTC TCTGCAGATC 
ATGT 

#1a -5. pCV-H77C(+49X) ; The proximal 49 nucleotides of the 3' conserved 
region ( 98 nucleotides ; AAT not included) removed from pCV-H77C 

TGA AGGTTGG GGTAAACACT CCGGCCT CTT AAG CCATTTC CTGTTTTTTT 
TTTTTTTTTT TTTTTTTTTT TCTTTTTTTT TTTCTTTCCT TTCCTTCTTT 
TTTTCCTTTC TTTTTCCCTT CTTTAATGCC GCATGACTGC AGAGAGTGCT 
GATACTGGCC TCTCTGCAGA TCATGT 

#1a -6. pCV-H77C(VR-24) ; First 24 nucleotides of the 3' variable region 
removed from pCV-H77C 

TGA CTTAAGC CATTTCCTGT TTTTTTTTTT TTTTTTTTTT TTTTTTTCTT 
TTTTTTTTTC TTTCCTTTCC TTCTTTTTTT CCTTTCTTTT TCCCTTCTTT 
AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT 
GAGCCGCATG ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCAGATCATG 
T 

#1a -7. pCV-H77C(-U/UC) ; Poly U-UC region removed from pCV-H77C 

TGAAGGTTGG GGTAAACACT CCGGCCTCTT AAGCCATTTC CTGAATGGTG 
GCTCCATCTT AGCCCTAGTC ACGGCTAGCT GTGAAAGGTC CGTGAGCCGC 
ATGACTGCAG AGAGTGCTGA TACTGGCCTC TCTGCAGATC ATGT 
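
Each of the seven deletion mutants above can be derived from the three regions of the parental 3' UTR by simple slicing. The following is a minimal sketch under that assumption; the function and variable names are illustrative and do not come from the original disclosure, and the slice boundaries are inferred from the mutant descriptions and the printed sequences.

```python
def clean(text: str) -> str:
    """Remove the grouping spaces used in the figure."""
    return "".join(text.split())

STOP = "TGA"  # stop codon for the polyprotein
# 3' variable region after the stop codon (40 nts; 43 nts with TGA).
VARIABLE = clean("AGGTTGGGGT AAACACTCCG GCCT CTTAAG CCATTTCCTG")
POLY_U_UC = clean(
    "TTTTTTTTTT TTTTTTTTTT TTTTTTTTCT TTTTTTTTTT CTTTCCTTTC "
    "CTTCTTTTTT TCCTTTCTTT TTCCCTTCTT T"
)
CONSERVED = clean(
    "AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT "
    "GAGCCGCATG ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCAGATCATG T"
)

def build_mutants() -> dict[str, str]:
    """Return the 3' terminal sequence (from TGA onward) of each construct."""
    upstream = STOP + VARIABLE + POLY_U_UC
    return {
        "pCV-H77C": upstream + CONSERVED,
        # 3' 98 nts removed: only AAT of the conserved region remains.
        "pCV-H77C(-98X)": upstream + CONSERVED[:-98],
        # 3' 42 nts removed.
        "pCV-H77C(-42X)": upstream + CONSERVED[:-42],
        # Entire 3' UTR removed except the 3' 49 nts.
        "pCV-H77C(X-52)": STOP + CONSERVED[52:],
        # Entire 3' UTR removed except the 3' 101 nts.
        "pCV-H77C(X)": STOP + CONSERVED,
        # Proximal 49 nts of the conserved region (after AAT) removed.
        "pCV-H77C(+49X)": upstream + CONSERVED[:3] + CONSERVED[52:],
        # First 24 nts of the variable region removed.
        "pCV-H77C(VR-24)": STOP + VARIABLE[24:] + POLY_U_UC + CONSERVED,
        # Poly U-UC region removed.
        "pCV-H77C(-U/UC)": STOP + VARIABLE + CONSERVED,
    }

MUTANTS = build_mutants()
for name, seq in MUTANTS.items():
    print(f"{name}: {len(seq)} nts from the stop codon")
```

Comparing the derived strings with the printed mutant sequences is a quick way to catch transcription errors in either.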

FIG. 17B 
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#1b. Strategy of 3' Deletion mutants 

#1b-1. pCV-H77C(-98X) 



[Diagram: 3' variable region, poly U-UC region, 3' conserved region (98 nts); Xba I*] 



1. PCR Amplification 

2. Purification of PCR products 

3. Digestion with Afl II and Xba I 

4. Cloning of Afl II / Xba I fragment into pCV-H77C 

5. Complete sequence analysis 

6. in vitro transcription (within 24 hours of inoculation) 

7. Percutaneous intrahepatic transfection into chimpanzee ; 11/26/97 and 12/17/97 

8. Result : Negative ( No replication) 



#1b-2. pCV-H77C(-42X) 

[Diagram: 3' variable region, poly U-UC region, 3' conserved region (42 nts); Afl II (9403), Nhe I (9530), Xba I*; synthesized oligonucleotides] 

1. Synthesis of oligonucleotides ( sense and anti-sense ) 

2. Hybridization of oligonucleotides 

3. Digestion with Nhe I and Xba I 

4. Cloning of Nhe I / Xba I fragment into pG9-KL26 (3' UTR of H77C) 

5. Sequence analysis 

6. Cloning of 3' UTR ( -42X ) [Afl II / Xba I fragment] into pCV-H77C 

7. Complete sequence analysis 

8. in vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee (Schedule; 1/22/98, 2/5/98 ) 
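
The Afl II and Nhe I sites used in this strategy can be located in the printed 3' UTR sequence, and their spacing compared with the coordinates shown in the diagram (9403 and 9530). The following is a minimal sketch; the recognition sequences CTTAAG (Afl II) and GCTAGC (Nhe I) are the standard ones, while the helper names are illustrative assumptions rather than anything in the original disclosure.

```python
def clean(text: str) -> str:
    """Remove the grouping spaces used in the figure."""
    return "".join(text.split())

# 3' UTR of pCV-H77C from the TGA stop codon onward, as printed in the figure.
UTR = clean(
    "TGA AGGTTGGGGT AAACACTCCG GCCT CTTAAG CCATTTCCTG "
    "TTTTTTTTTT TTTTTTTTTT TTTTTTTTCT TTTTTTTTTT CTTTCCTTTC "
    "CTTCTTTTTT TCCTTTCTTT TTCCCTTCTT T "
    "AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT "
    "GAGCCGCATG ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCAGATCATG T"
)

RECOGNITION_SITES = {"Afl II": "CTTAAG", "Nhe I": "GCTAGC"}

def find_sites(seq: str, site: str) -> list[int]:
    """Return every 0-based start position of `site` in `seq`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions

HITS = {name: find_sites(UTR, site) for name, site in RECOGNITION_SITES.items()}

# Distance between the two sites in the sequence versus the figure coordinates.
SPACING = HITS["Nhe I"][0] - HITS["Afl II"][0]
FIGURE_SPACING = 9530 - 9403

print(HITS)
print(f"spacing in sequence: {SPACING}, spacing from figure: {FIGURE_SPACING}")
```

Each site occurs once in the printed 3' UTR, and the spacing in the sequence matches the difference between the two coordinates in the diagram.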



FIG. 17C 
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#1b-3. pCV-H77C(X-52) 

[Diagram: NS5B, TGA, 3' variable region, poly U-UC region, 3' conserved region (52 nts | 49 nts); Nde I (9160); fragment a (Pfu PCR) and fragment b (synthesized oligonucleotides) joined by fusion and extension] 



1. Fragment a ; Pfu PCR amplification and purification 

2. Fragment b ; Synthesized oligonucleotides (anti-sense) 

3. Fusion and extension 

4. TA cloning 

5. Sequence analysis 

6. Cloning Nde I-Xba I fragment with correct sequence into pCV-H77C 

7. Complete sequence analysis 

8. In vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee 

FIG. 17D 



WO 99/04008    PCT/US98/14688 



#1b-4. pCV-H77C(X) 



[Diagram: 3' variable region, poly U-UC region, 3' conserved region (101 nts); fragment a (Pfu PCR) and fragment c (synthesized oligonucleotides) joined by fusion and extension] 



1. Fragment a ; Pfu PCR amplification and purification 

2. Fragment c ; Synthesized oligonucleotides (anti-sense) 

3. Fusion and extension 

4. TA cloning 

5. Sequence analysis 

6. Cloning Nde I-Xba I fragment with correct sequence into pCV-H77C 

7. Complete sequence analysis 

8. In vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee 



FIG. 17E 
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#1b-5. pCV-H77C(+49X) 

[Diagram: 3' variable region, poly U-UC region, 3' conserved region (49 nts); Afl II (9403), Xba I*; fragment d (Pfu PCR) and fragment e (synthesized oligonucleotides) joined by fusion and extension] 



1. Fragment d ; Pfu PCR amplification and purification 

2. Fragment e ; Synthesized oligonucleotides (anti-sense) 

3. Fusion and extension 

4. TA cloning 

5. Sequence analysis 

6. Cloning Afl II-Xba I fragment with correct sequence into pCV-H77C 

7. Complete sequence analysis 

8. in vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee 

FIG. 17F 
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#1 b -6. pCV-H77C(VR-24) 

[Diagram: NS5B, TGA, 3' variable region (24 nts), poly U-UC region, 3' conserved region; PCR between Nde I (9160) and Afl II (9403)] 

1. PCR Amplification 

2. Purification of PCR products 

3. Digestion with Nde I and Afl II 

4. Cloning of Nde I / Afl II fragment into pCV-H77C 

5. Complete sequence analysis 

6. in vitro transcription (within 24 hours of inoculation) 

7. Percutaneous intrahepatic transfection into chimpanzee 



#1b-7. pCV-H77C(-U/UC) 

[Diagram: 3' variable region, poly U-UC region, 3' conserved region (31 nts); Afl II (9403), Nhe I (9530); synthesized oligonucleotides] 



1. Synthesis of oligonucleotides ( sense and anti-sense ) 

2. Hybridization of oligonucleotides 

3. Digestion with Afl II and Nhe I 

4. Cloning of Afl II and Nhe I fragment into pG9-KL26 

5. Sequence analysis 

6. Cloning of 3' UTR ( -poly U-UC ) [Afl II / Xba I fragment] into pCV-H77C 

7. Complete sequence analysis 

8. in vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee 

FIG. 17G 
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